
Re: Telemetry Policy - Remaining Questions

2018-05-14 Thread Jaroslaw Staniek

On 30 April 2018 at 22:54, Lydia Pintscher  wrote:

> On Mon, Apr 30, 2018 at 10:41 PM Jaroslaw Staniek  wrote:
> > Hello
> > Now we can assume that the solution to non-unique identification that Volker
> > explained is an acceptable equivalent of random identifiers, so KEXI does not
> > need an exception.
> > Thanks for your patience!
>
> > I understand KEXI has time until the next release to switch to
> > KUserFeedback. In other words, the next non-patch release (3.2) would be
> > compliant and would store data within KDE infra. For 3.1.x we can stop
> > saving unique random identifiers.
>
> Perfect. Thank you!
>

Update: as a first step, I have disabled collecting *any* data on
kexi-project.org, so it is no longer affected by GDPR matters at all.

-- 
regards, Jaroslaw Staniek

KDE:
: A world-wide network of software engineers, artists, writers, translators
: and facilitators committed to Free Software development - http://kde.org
KEXI:
: A visual database apps builder - http://calligra.org/kexi
  http://twitter.com/kexi_project https://facebook.com/kexi.project
Qt Certified Specialist:
: http://www.linkedin.com/in/jstaniek

Re: Telemetry Policy - Remaining Questions

2018-04-30 Thread Lydia Pintscher

On Mon, Apr 30, 2018 at 10:41 PM Jaroslaw Staniek  wrote:
> Hello
> Now we can assume that the solution to non-unique identification that Volker
> explained is an acceptable equivalent of random identifiers, so KEXI does not
> need an exception.
> Thanks for your patience!

> I understand KEXI has time until the next release to switch to
> KUserFeedback. In other words, the next non-patch release (3.2) would be
> compliant and would store data within KDE infra. For 3.1.x we can stop
> saving unique random identifiers.

Perfect. Thank you!


Cheers
Lydia

-- 
Lydia Pintscher - http://about.me/lydia.pintscher
KDE e.V. Board of Directors
http://kde.org - http://open-advice.org

Re: Telemetry Policy - Remaining Questions

2018-04-30 Thread Jaroslaw Staniek

On 30 April 2018 at 22:08, Lydia Pintscher  wrote:

> Hey folks,
>
> Jaroslaw:
> * Given that GDPR is coming into effect on May 25th, I'd like to urge you to
> look into whether what you're currently tracking is acceptable under that
> regulation. I don't know where the data you're collecting currently
> ends up, but I don't want the e.V. to be liable for personally identifiable
> information ending up on non-e.V. servers etc.
> * Please decide whether you're willing to change Kexi to comply with the
> policy or whether you prefer it to be marked as a historic exception. (My
> preference is not to have any exceptions to the policy and make a clear
> statement to our users.)
>

Hello
Now we can assume that the solution to non-unique identification that Volker
explained is an acceptable equivalent of random identifiers, so KEXI does not
need an exception.
Thanks for your patience!

I understand KEXI has time until the next release to switch
to KUserFeedback. In other words, the next non-patch release (3.2) would be
compliant and would store data within KDE infra. For 3.1.x we can stop
saving unique random identifiers.



> Everyone: Unless there are big objections within the next week let's
> consider the current draft at
> https://community.kde.org/Policies/Telemetry_Policy accepted.
>
>
> Cheers
> Lydia
>
> --
> Lydia Pintscher - http://about.me/lydia.pintscher
> KDE e.V. Board of Directors
> http://kde.org - http://open-advice.org
>



-- 
regards, Jaroslaw Staniek


Re: Telemetry Policy - Remaining Questions

2018-04-30 Thread Lydia Pintscher

Hey folks,

Jaroslaw:
* Given that GDPR is coming into effect on May 25th, I'd like to urge you to
look into whether what you're currently tracking is acceptable under that
regulation. I don't know where the data you're collecting currently
ends up, but I don't want the e.V. to be liable for personally identifiable
information ending up on non-e.V. servers etc.
* Please decide whether you're willing to change Kexi to comply with the
policy or whether you prefer it to be marked as a historic exception. (My
preference is not to have any exceptions to the policy and make a clear
statement to our users.)

Everyone: Unless there are big objections within the next week let's
consider the current draft at
https://community.kde.org/Policies/Telemetry_Policy accepted.


Cheers
Lydia

-- 
Lydia Pintscher - http://about.me/lydia.pintscher
KDE e.V. Board of Directors
http://kde.org - http://open-advice.org

Re: Telemetry Policy - Remaining Questions

2018-04-04 Thread Jaroslaw Staniek

On 4 April 2018 at 12:37, Ben Cooksley  wrote:

> On Tue, Apr 3, 2018 at 8:57 PM, Jaroslaw Staniek  wrote:
> >
> >
> > On 3 April 2018 at 10:17, Ben Cooksley  wrote:
> >>
> >> On Tue, Apr 3, 2018 at 11:20 AM, Jaroslaw Staniek 
> wrote:
> >> >
> >> >
> >> > On 2 April 2018 at 22:56, Lydia Pintscher  wrote:
> >> >>
> >> >> Hey Jaroslaw :)
> >> >>
> >> >> On Mon, Apr 2, 2018 at 10:28 PM, Jaroslaw Staniek 
> >> >> wrote:
> >> >> > Thanks for reminding me Lydia
> >> >> >
> >> >> > I've not forgotten this. While there's progress I do still see this
> >> >> > as a
> >> >> > pilot stage and do not think we're in a hurry given telemetry is
> >> >> > something
> >> >> > "extra" for a project development, not a core feature of any
> product.
> >> >>
> >> >> We are in a hurry now. We're waiting for projects to be able to start
> >> >> using it and get us valuable insights about how our software is used.
> >> >> We've been on it since last Akademy. Let's get it finished :)
> >> >>
> >> >> > Below I am referring to this version:
> >> >> >
> >> >> >
> >> >> > https://community.kde.org/index.php?title=Policies/Telemetry_Policy&oldid=78057
> >> >> >
> >> >> > tl;dr: Why discussing: Any deep change and limitation to projects'
> >> >> > freedom
> >> >> > needs to bring substantial benefits over drawbacks. Level of
> >> >> > complexity
> >> >> > of
> >> >> > the contract for a project or individual developer needs to be
> >> >> > balanced
> >> >> > by
> >> >> > real (not hypothetical) benefits.
> >> >>
> >> >> The benefits here for KDE are:
> >> >> * we have a better understanding of our userbase leading hopefully to
> >> >> better software
> >> >> * we have a better understanding of our userbase leading hopefully to
> >> >> better marketing
> >> >> * we have a clear policy we can point our users to that explains how
> >> >> we are handling their data and that is in line with our vision/what
> we
> >> >> stand for.
> >> >>
> >> >> > I've studied the wiki page more in depth and I have these points
> >> >> > where
> >> >> > I'd
> >> >> > like to see improvement. This is based on my experience, not a list
> >> >> > of
> >> >> > quick
> >> >> > ideas.
> >> >> >
> >> >> >
> >> >> https://community.kde.org/Talk:Policies/Telemetry_Policy#
> >> >>
> >> >> Thank you! Volker is probably best equipped to answer these.
> >> >>
> >> >> > That said: I will nod to the concept of "Minimalism", it is all
> >> >> > classic
> >> >> > property of telemetry. I think I've seen them in other projects
> too.
> >> >> > I'd just say, let's not make all this more limited than anyone
> wants
> >> >> > it
> >> >> > to
> >> >> > be.
> >> >>
> >> >> Where is it too limited? Please keep in mind that we've set privacy as
> >> >> a core part of our vision and the current goals.
> >> >
> >> >
> >> > Lydia,
> >>
> >> Hi Jaroslaw,
> >>
> >> > It's a core part but still a part and can't contradict, say, with the
> >> > Freedom part.
> >> >
> >> > Please see the list of limitations:
> >> >  https://community.kde.org/Talk:Policies/Telemetry_Policy#
> >> > (in my opinion that's not a "nice to haves" but requirements needed so
> >> > we
> >> > can even call the whole thing "telemetry")
> >> >
> >> > I am asking for an alternative approaches, Volker once mentioned there
> >> > are
> >> > some.
> >> > We need them to we move forward.
> >> >
> >> > In the meantime my stack runs just well, people that use IDs are even
> >> > given
> >> > right to remove their data, something that's *not* going to be
> possible
> >> > with
> >> > the proposed vision. Someone would convince me otherwise.
> >>
> >> Please don't drag our websites ability to have people login to them
> >> into your argument here.
> >> Cookies as used by websites are quite different to Telemetry on many
> >> points.
> >
> >
> > Dear Ben, based on your experience I'd like to hear your voice how web
> apps
> > of any kind are different or are special cases, compared to apps that
> happen
> > to do the same but do not use the "web" stamp so discussed data
> collection
> > features are delegate to 3rd-party clients called web browsers.
> > How an OPT-IN ID like 2a7c819f-636c-403e-afa1-c9e37031c1de based on
> random
> > generator[1] is more serious privacy concern than required
> > (login+email+password) non-anonymized tuple for web accounts of web apps
> of
> > any kind. Please do not take this as pointing to any core
> infrastructure, I
> > am pointing to specific established technology and practices.
>
> Web applications (as we deploy anyway) are a bit different as the
> action of registering, and then logging in, requires specific and
> deliberate engagement on the users part while the Opt-In process used
> by applications could be as simple as a popup on first startup, or a
> checkbox in it's configuration (therefore making the required effort
> much lower). If at any point a user is not logged in, we have no idea
> who they are until they login (and many of our sites do not send any
> cooki

Re: Telemetry Policy - Remaining Questions

2018-04-04 Thread Ben Cooksley

On Tue, Apr 3, 2018 at 8:57 PM, Jaroslaw Staniek  wrote:
>
>
> On 3 April 2018 at 10:17, Ben Cooksley  wrote:
>>
>> On Tue, Apr 3, 2018 at 11:20 AM, Jaroslaw Staniek  wrote:
>> >
>> >
>> > On 2 April 2018 at 22:56, Lydia Pintscher  wrote:
>> >>
>> >> Hey Jaroslaw :)
>> >>
>> >> On Mon, Apr 2, 2018 at 10:28 PM, Jaroslaw Staniek 
>> >> wrote:
>> >> > Thanks for reminding me Lydia
>> >> >
>> >> > I've not forgotten this. While there's progress I do still see this
>> >> > as a
>> >> > pilot stage and do not think we're in a hurry given telemetry is
>> >> > something
>> >> > "extra" for a project development, not a core feature of any product.
>> >>
>> >> We are in a hurry now. We're waiting for projects to be able to start
>> >> using it and get us valuable insights about how our software is used.
>> >> We've been on it since last Akademy. Let's get it finished :)
>> >>
>> >> > Below I am referring to this version:
>> >> >
>> >> >
>> >> > https://community.kde.org/index.php?title=Policies/Telemetry_Policy&oldid=78057
>> >> >
>> >> > tl;dr: Why discussing: Any deep change and limitation to projects'
>> >> > freedom
>> >> > needs to bring substantial benefits over drawbacks. Level of
>> >> > complexity
>> >> > of
>> >> > the contract for a project or individual developer needs to be
>> >> > balanced
>> >> > by
>> >> > real (not hypothetical) benefits.
>> >>
>> >> The benefits here for KDE are:
>> >> * we have a better understanding of our userbase leading hopefully to
>> >> better software
>> >> * we have a better understanding of our userbase leading hopefully to
>> >> better marketing
>> >> * we have a clear policy we can point our users to that explains how
>> >> we are handling their data and that is in line with our vision/what we
>> >> stand for.
>> >>
>> >> > I've studied the wiki page more in depth and I have these points
>> >> > where
>> >> > I'd
>> >> > like to see improvement. This is based on my experience, not a list
>> >> > of
>> >> > quick
>> >> > ideas.
>> >> >
>> >> >
>> >> https://community.kde.org/Talk:Policies/Telemetry_Policy#
>> >>
>> >> Thank you! Volker is probably best equipped to answer these.
>> >>
>> >> > That said: I will nod to the concept of "Minimalism", it is all
>> >> > classic
>> >> > property of telemetry. I think I've seen them in other projects too.
>> >> > I'd just say, let's not make all this more limited than anyone wants
>> >> > it
>> >> > to
>> >> > be.
>> >>
>> >> Where is it too limited? Please keep in mind that we've set privacy as
>> >> a core part of our vision and the current goals.
>> >
>> >
>> > Lydia,
>>
>> Hi Jaroslaw,
>>
>> > It's a core part but still a part and can't contradict, say, with the
>> > Freedom part.
>> >
>> > Please see the list of limitations:
>> >  https://community.kde.org/Talk:Policies/Telemetry_Policy#
>> > (in my opinion that's not a "nice to haves" but requirements needed so
>> > we
>> > can even call the whole thing "telemetry")
>> >
>> > I am asking for an alternative approaches, Volker once mentioned there
>> > are
>> > some.
>> > We need them to we move forward.
>> >
>> > In the meantime my stack runs just well, people that use IDs are even
>> > given
>> > right to remove their data, something that's *not* going to be possible
>> > with
>> > the proposed vision. Someone would convince me otherwise.
>>
>> Please don't drag our websites ability to have people login to them
>> into your argument here.
>> Cookies as used by websites are quite different to Telemetry on many
>> points.
>
>
> Dear Ben, based on your experience I'd like to hear your voice how web apps
> of any kind are different or are special cases, compared to apps that happen
> to do the same but do not use the "web" stamp so discussed data collection
> features are delegate to 3rd-party clients called web browsers.
> How an OPT-IN ID like 2a7c819f-636c-403e-afa1-c9e37031c1de based on random
> generator[1] is more serious privacy concern than required
> (login+email+password) non-anonymized tuple for web accounts of web apps of
> any kind. Please do not take this as pointing to any core infrastructure, I
> am pointing to specific established technology and practices.

Web applications (as we deploy them, anyway) are a bit different, as the
action of registering, and then logging in, requires specific and
deliberate engagement on the user's part, while the opt-in process used
by applications could be as simple as a popup on first startup, or a
checkbox in its configuration (therefore making the required effort
much lower). If at any point a user is not logged in, we have no idea
who they are until they log in (and many of our sites do not send any
cookies until you try logging in).

Additionally, the only information we collect from users is that which
they deliberately enter in (and have therefore chosen to provide to
us). We also don't record any viewing activity on our sites - only
actions which change the site (such as posting a bug, editing a wiki
page or commenting

Re: Telemetry Policy - Remaining Questions

2018-04-04 Thread Volker Krause

On Tuesday, 3 April 2018 18:16:10 CEST Jaroslaw Staniek wrote:
> On 3 April 2018 at 10:42, Volker Krause  wrote:
> > > > https://community.kde.org/Talk:Policies/Telemetry_Policy#
> > > 
> > > Thank you! Volker is probably best equipped to answer these.
> > 
> > I've commented on all points on the talk page now I think.
> 
> Thanks Volker,
> I must have missed the info that the concepts of intervals have been
> documented two weeks ago. Apologies.
> The concept has potential to change the game and convince people to
> "invest" in telemetry. However maybe a pilot phase with an app or two would
> be more in place to see the results. Policies would then follow reality
> even more.

The agreement at Akademy 2017 was to not do that without regulation, which is 
what got us here.

> It would be good to consider picking larger projects, e.g. part of KDE
> Applications release or Plasma itself, and with maintainers able to discuss
> during the Akademy. 

Right. That's what happened last year, that's why we are having the policy 
draft and this discussion.

> KEXI is just not in this group. Also, based on my
> knowledge, statistical KEXI users would not even have access to the KDE
> "kill switch" since Plasma is rarely used by them.

That's why the policy says "[...] where they exist.". Obviously if you are 
running on a system without such settings there is nothing you can do, so you 
just follow per-application settings there.
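
The rule described here can be sketched in a few lines. This is only an illustration in Python, not code from KUserFeedback or the policy text: the function name `telemetry_allowed` and the convention that `None` means "this platform has no global setting" are assumptions.

```python
from typing import Optional


def telemetry_allowed(app_opt_in: bool, global_switch: Optional[bool]) -> bool:
    """Decide whether an application may submit telemetry.

    app_opt_in    -- the user's explicit per-application choice (off by default)
    global_switch -- a system-wide setting, or None where no such setting exists
    """
    if global_switch is None:
        # No global "kill switch" on this system: only the
        # per-application setting applies.
        return app_opt_in
    # Where a global setting exists, it can only restrict, never enable.
    return app_opt_in and global_switch
```

With this shape, a KEXI user on a system without Plasma is governed by the per-application setting alone, while a global setting, where present, always wins when it says "off".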

> In particular there is a chance that if pilot apps get released with the
> KUserFeedback telemetry, the challenge of handling access to the raw
> anonymized data would get addressed, since we would know why there is the
> interest in data and what is the purpose of collecting it. In the
> post-Akademy thread Ingo Klöcker has once provided interesting general
> observation, even it it refers to German law. Once the data "leaks" outside
> of the telemetry team (say telemetry team within a single app project), to
> people not even affiliated with the project, it is no longer possible to
> expect it (the data) is used in a way compliant with the Policy, and
> only for the needs of "legal" telemetry. KDE would still be responsible for
> all the uses of the data. Coming year stronger law comes to Poland too.

That's where the motivation to make the result "data" rather than "personal 
data" comes from. Then leaking/sharing/etc is not a problem. As soon as there 
is anything in there considered "personal data" (as defined by data protection 
regulation), you are right, we need access control, data removal procedures, 
limited storage time, etc, and come into scope of GDPR & friends. That is why 
allowing a unique identifier makes such a big difference to all of this.

> So making it harder to download the data and increasing level of its
> aggregation would make sense then to me. Long-term idea would also
> be giving access to an analysis tool and the results instead of to the data
> on which the tool operates.
> 
> 
> And then in the meantime folks like the Promo team have potential to work
> on proper messaging to improve the number users understanding and accepting
> telemetry. My example follows here. On more "human" level: KEXI offering a
> telemetry "for KDE" that is still seen as a "desktop" (regardless of a
> reason) than "for KEXI Team" would be potentially a disadvantage for the
> app's project itself. KDE contributors and fans understand the idea of an
> organization. I've learned from my group that for everyone else "Help
> improve Visual Studio" type of message tends convince more than more
> questionable "Help Microsoft" ;)

How to communicate things to the user is an important topic, and probably has 
no unified answer. I indeed can see that being different for say KEXI and 
Plasma. However, I do think it's useful for this to be able to point to strong 
regulation that ensures privacy, control and transparency.

Regards,
Volker


Re: Telemetry Policy - Remaining Questions

2018-04-03 Thread Jaroslaw Staniek

On 3 April 2018 at 10:42, Volker Krause  wrote:

> Thanks Lydia for getting this moving again!
>
> On Monday, 2 April 2018 22:56:31 CEST Lydia Pintscher wrote:
> > Hey Jaroslaw :)
> >
> > On Mon, Apr 2, 2018 at 10:28 PM, Jaroslaw Staniek 
> wrote:
> > > Thanks for reminding me Lydia
> > >
> > > I've not forgotten this. While there's progress I do still see this as
> a
> > > pilot stage and do not think we're in a hurry given telemetry is
> something
> > > "extra" for a project development, not a core feature of any product.
> >
> > We are in a hurry now. We're waiting for projects to be able to start
> > using it and get us valuable insights about how our software is used.
> > We've been on it since last Akademy. Let's get it finished :)
> >
> > > Below I am referring to this version:
> > > > https://community.kde.org/index.php?title=Policies/Telemetry_Policy&oldid=78057
> > >
> > > tl;dr: Why discussing: Any deep change and limitation to projects'
> freedom
> > > needs to bring substantial benefits over drawbacks. Level of
> complexity of
> > > the contract for a project or individual developer needs to be
> balanced by
> > > real (not hypothetical) benefits.
> >
> > The benefits here for KDE are:
> > * we have a better understanding of our userbase leading hopefully to
> > better software
> > * we have a better understanding of our userbase leading hopefully to
> > better marketing
> > * we have a clear policy we can point our users to that explains how
> > we are handling their data and that is in line with our vision/what we
> > stand for.
> >
> > > I've studied the wiki page more in depth and I have these points where
> I'd
> > > like to see improvement. This is based on my experience, not a list of
> > > quick ideas.
> > >
> > > https://community.kde.org/Talk:Policies/Telemetry_Policy#
> >
> > Thank you! Volker is probably best equipped to answer these.
>
> I've commented on all points on the talk page now I think.
>

Thanks Volker,
I must have missed the info that the concept of intervals was
documented two weeks ago. Apologies.
The concept has the potential to change the game and convince people to
"invest" in telemetry. However, a pilot phase with an app or two might
be more appropriate, to see the results first. Policies would then follow
reality even more closely.

It would be good to consider picking larger projects, e.g. ones that are part
of the KDE Applications release or Plasma itself, with maintainers able to
discuss this during Akademy. KEXI is just not in this group. Also, to my
knowledge, the typical KEXI user would not even have access to the KDE
"kill switch", since they rarely use Plasma.

In particular, there is a chance that if pilot apps get released with
KUserFeedback telemetry, the challenge of handling access to the raw
anonymized data would get addressed, since we would know why there is
interest in the data and what the purpose of collecting it is. In the
post-Akademy thread Ingo Klöcker once made an interesting general
observation, even if it refers to German law. Once the data "leaks" outside
of the telemetry team (say, the telemetry team within a single app project), to
people not even affiliated with the project, it is no longer possible to
expect that the data is used in a way compliant with the Policy and
only for the needs of "legal" telemetry. KDE would still be responsible for
all uses of the data. In the coming year a stronger law comes to Poland too.

So making it harder to download the data and increasing the level of its
aggregation would make sense to me. A long-term idea would also
be to give access to an analysis tool and its results instead of to the data
on which the tool operates.
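
As a rough illustration of "results instead of data", a tool could expose only per-value counts and withhold rare values. This is a sketch in Python under stated assumptions: the function name `aggregate`, the report layout (a list of dicts), and the `min_count` threshold are all hypothetical and are not taken from the policy or from KUserFeedback.

```python
from collections import Counter


def aggregate(reports, field, min_count=5):
    """Return counts per value of `field`, hiding rarely seen values.

    reports   -- iterable of dicts, one per submitted report (assumed layout)
    field     -- the key to summarize, e.g. "locale"
    min_count -- hypothetical threshold; buckets below it are not published
    """
    counts = Counter(r[field] for r in reports if field in r)
    return {value: n for value, n in counts.items() if n >= min_count}
```

People outside the telemetry team would then see only the returned dictionary, never the individual reports.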

In the meantime, folks like the Promo team have the potential to work
on proper messaging to increase the number of users who understand and accept
telemetry. My example follows. On a more "human" level: KEXI offering
telemetry "for KDE", which is still seen as a "desktop" (regardless of the
reason), rather than "for the KEXI Team" would potentially be a disadvantage
for the app's project itself. KDE contributors and fans understand the idea of
an organization. I've learned from my group that for everyone else a "Help
improve Visual Studio" type of message tends to convince more than the more
questionable "Help Microsoft" ;)

-- 
regards, Jaroslaw Staniek


Re: Telemetry Policy - Remaining Questions

2018-04-03 Thread Jaroslaw Staniek

On 3 April 2018 at 10:17, Ben Cooksley  wrote:

> On Tue, Apr 3, 2018 at 11:20 AM, Jaroslaw Staniek  wrote:
> >
> >
> > On 2 April 2018 at 22:56, Lydia Pintscher  wrote:
> >>
> >> Hey Jaroslaw :)
> >>
> >> On Mon, Apr 2, 2018 at 10:28 PM, Jaroslaw Staniek 
> wrote:
> >> > Thanks for reminding me Lydia
> >> >
> >> > I've not forgotten this. While there's progress I do still see this
> as a
> >> > pilot stage and do not think we're in a hurry given telemetry is
> >> > something
> >> > "extra" for a project development, not a core feature of any product.
> >>
> >> We are in a hurry now. We're waiting for projects to be able to start
> >> using it and get us valuable insights about how our software is used.
> >> We've been on it since last Akademy. Let's get it finished :)
> >>
> >> > Below I am referring to this version:
> >> >
> >> > https://community.kde.org/index.php?title=Policies/Telemetry_Policy&oldid=78057
> >> >
> >> > tl;dr: Why discussing: Any deep change and limitation to projects'
> >> > freedom
> >> > needs to bring substantial benefits over drawbacks. Level of
> complexity
> >> > of
> >> > the contract for a project or individual developer needs to be
> balanced
> >> > by
> >> > real (not hypothetical) benefits.
> >>
> >> The benefits here for KDE are:
> >> * we have a better understanding of our userbase leading hopefully to
> >> better software
> >> * we have a better understanding of our userbase leading hopefully to
> >> better marketing
> >> * we have a clear policy we can point our users to that explains how
> >> we are handling their data and that is in line with our vision/what we
> >> stand for.
> >>
> >> > I've studied the wiki page more in depth and I have these points where
> >> > I'd
> >> > like to see improvement. This is based on my experience, not a list of
> >> > quick
> >> > ideas.
> >> >
> >> >
> >> https://community.kde.org/Talk:Policies/Telemetry_Policy#
> >>
> >> Thank you! Volker is probably best equipped to answer these.
> >>
> >> > That said: I will nod to the concept of "Minimalism", it is all
> classic
> >> > property of telemetry. I think I've seen them in other projects too.
> >> > I'd just say, let's not make all this more limited than anyone wants
> it
> >> > to
> >> > be.
> >>
> >> Where is it too limited? Please keep in mind that we've set privacy as
> >> a core part of our vision and the current goals.
> >
> >
> > Lydia,
>
> Hi Jaroslaw,
>
> > It's a core part but still a part and can't contradict, say, with the
> > Freedom part.
> >
> > Please see the list of limitations:
> >  https://community.kde.org/Talk:Policies/Telemetry_Policy#
> > (in my opinion that's not a "nice to haves" but requirements needed so we
> > can even call the whole thing "telemetry")
> >
> > I am asking for an alternative approaches, Volker once mentioned there
> are
> > some.
> > We need them to we move forward.
> >
> > In the meantime my stack runs just well, people that use IDs are even
> given
> > right to remove their data, something that's *not* going to be possible
> with
> > the proposed vision. Someone would convince me otherwise.
>
> Please don't drag our websites ability to have people login to them
> into your argument here.
> Cookies as used by websites are quite different to Telemetry on many
> points.
>

Dear Ben, based on your experience, I'd like to hear your view on how web
apps of any kind are different or are special cases compared to apps that
happen to do the same but do not carry the "web" stamp, so that the discussed
data collection features are delegated to 3rd-party clients called web browsers.
How is an OPT-IN ID like 2a7c819f-636c-403e-afa1-c9e37031c1de, based on a random
generator[1], a more serious privacy concern than the required
(login+email+password) non-anonymized tuple for web accounts of web apps of
any kind? Please do not take this as pointing to any core infrastructure; I
am pointing to specific established technology and practices.

Then do we agree that the purpose of random ID collection is secondary as
long as both sides know it and agree on the terms of collaboration? And,
on top of that, can pull the data out?

I am referring to functional web sites produced by any KDE project as apps,
hoping that's not seen as dragging. Please do not read my concern as
criticism of the web apps, because in my opinion apps of any technology
have the right to use anonymized unique IDs with the user's consent, for
purposes clearly stated to the user, to achieve openly explained goals
welcomed by the users. Or, from a different angle, I see nothing in Freedom
that prohibits Free projects from offering such features to Free users
unconditionally[2].

[1] If separating the independent aspects being measured is a concern (e.g.
[screen size] from [locale]), a unique user can have multiple IDs generated,
one per analyzed group of aspects, to fully decouple one area from
another in the raw data (as in the example: separating screen size analysis
from locale analysis).
[2] Unconditionally == as stated by

Re: Telemetry Policy - Remaining Questions

2018-04-03 Thread Volker Krause

Thanks Lydia for getting this moving again!

On Monday, 2 April 2018 22:56:31 CEST Lydia Pintscher wrote:
> Hey Jaroslaw :)
> 
> On Mon, Apr 2, 2018 at 10:28 PM, Jaroslaw Staniek  wrote:
> > Thanks for reminding me Lydia
> > 
> > I've not forgotten this. While there's progress I do still see this as a
> > pilot stage and do not think we're in a hurry given telemetry is something
> > "extra" for a project development, not a core feature of any product.
> 
> We are in a hurry now. We're waiting for projects to be able to start
> using it and get us valuable insights about how our software is used.
> We've been on it since last Akademy. Let's get it finished :)
> 
> > Below I am referring to this version:
> > https://community.kde.org/index.php?title=Policies/Telemetry_Policy&oldid=78057
> > 
> > tl;dr: Why discussing: Any deep change and limitation to projects' freedom
> > needs to bring substantial benefits over drawbacks. Level of complexity of
> > the contract for a project or individual developer needs to be balanced by
> > real (not hypothetical) benefits.
> 
> The benefits here for KDE are:
> * we have a better understanding of our userbase leading hopefully to
> better software
> * we have a better understanding of our userbase leading hopefully to
> better marketing
> * we have a clear policy we can point our users to that explains how
> we are handling their data and that is in line with our vision/what we
> stand for.
> 
> > I've studied the wiki page more in depth and I have these points where I'd
> > like to see improvement. This is based on my experience, not a list of
> > quick ideas.
> > 
> > https://community.kde.org/Talk:Policies/Telemetry_Policy#
> 
> Thank you! Volker is probably best equipped to answer these.

I've commented on all points on the talk page now I think.

Regards,
Volker


Re: Telemetry Policy - Remaining Questions

2018-04-03 Thread Ben Cooksley

On Tue, Apr 3, 2018 at 11:20 AM, Jaroslaw Staniek  wrote:
>
>
> On 2 April 2018 at 22:56, Lydia Pintscher  wrote:
>>
>> Hey Jaroslaw :)
>>
>> On Mon, Apr 2, 2018 at 10:28 PM, Jaroslaw Staniek  wrote:
>> > Thanks for reminding me Lydia
>> >
>> > I've not forgotten this. While there's progress I do still see this as a
>> > pilot stage and do not think we're in a hurry given telemetry is
>> > something
>> > "extra" for a project development, not a core feature of any product.
>>
>> We are in a hurry now. We're waiting for projects to be able to start
>> using it and get us valuable insights about how our software is used.
>> We've been on it since last Akademy. Let's get it finished :)
>>
>> > Below I am referring to this version:
>> >
>> > https://community.kde.org/index.php?title=Policies/Telemetry_Policy&oldid=78057
>> >
>> > tl;dr: Why discussing: Any deep change and limitation to projects' freedom
>> > needs to bring substantial benefits over drawbacks. Level of complexity of
>> > the contract for a project or individual developer needs to be balanced by
>> > real (not hypothetical) benefits.
>>
>> The benefits here for KDE are:
>> * we have a better understanding of our userbase leading hopefully to
>> better software
>> * we have a better understanding of our userbase leading hopefully to
>> better marketing
>> * we have a clear policy we can point our users to that explains how
>> we are handling their data and that is in line with our vision/what we
>> stand for.
>>
>> > I've studied the wiki page more in depth and I have these points where I'd
>> > like to see improvement. This is based on my experience, not a list of quick
>> > ideas.
>> >
>> > https://community.kde.org/Talk:Policies/Telemetry_Policy#
>>
>> Thank you! Volker is probably best equipped to answer these.
>>
>> > That said: I will nod to the concept of "Minimalism", it is all classic
>> > property of telemetry. I think I've seen them in other projects too.
>> > I'd just say, let's not make all this more limited than anyone wants it to
>> > be.
>>
>> Where is it too limited? Please keep in mind that we've set privacy as
>> a core part of our vision and the current goals.
>
>
> Lydia,

Hi Jaroslaw,

> It's a core part, but still only a part, and it can't contradict, say, the
> Freedom part.
>
> Please see the list of limitations:
>  https://community.kde.org/Talk:Policies/Telemetry_Policy#
> (in my opinion those are not "nice to haves" but requirements needed so that
> we can even call the whole thing "telemetry")
>
> I am asking for alternative approaches; Volker once mentioned there are
> some. We need them in order to move forward.
>
> In the meantime my stack runs just fine. People who use IDs are even given
> the right to remove their data, something that's *not* going to be possible
> with the proposed vision. Someone would have to convince me otherwise.

Please don't drag our websites' ability to have people log in to them
into your argument here.
Cookies as used by websites are quite different from telemetry on many points.

Regards,
Ben

>
> --
> regards, Jaroslaw Staniek
>
> KDE:
> : A world-wide network of software engineers, artists, writers, translators
> : and facilitators committed to Free Software development - http://kde.org
> KEXI:
> : A visual database apps builder - http://calligra.org/kexi
>   http://twitter.com/kexi_project https://facebook.com/kexi.project
> Qt Certified Specialist:
> : http://www.linkedin.com/in/jstaniek

Re: Telemetry Policy - Remaining Questions

2018-04-02 Thread Jaroslaw Staniek

On 2 April 2018 at 22:56, Lydia Pintscher  wrote:

> Hey Jaroslaw :)
>
> On Mon, Apr 2, 2018 at 10:28 PM, Jaroslaw Staniek  wrote:
> > Thanks for reminding me Lydia
> >
> > I've not forgotten this. While there's progress I do still see this as a
> > pilot stage and do not think we're in a hurry given telemetry is something
> > "extra" for a project development, not a core feature of any product.
>
> We are in a hurry now. We're waiting for projects to be able to start
> using it and get us valuable insights about how our software is used.
> We've been on it since last Akademy. Let's get it finished :)
>
> > Below I am referring to this version:
> > https://community.kde.org/index.php?title=Policies/Telemetry_Policy&oldid=78057
> >
> > tl;dr: Why discussing: Any deep change and limitation to projects' freedom
> > needs to bring substantial benefits over drawbacks. Level of complexity of
> > the contract for a project or individual developer needs to be balanced by
> > real (not hypothetical) benefits.
>
> The benefits here for KDE are:
> * we have a better understanding of our userbase leading hopefully to
> better software
> * we have a better understanding of our userbase leading hopefully to
> better marketing
> * we have a clear policy we can point our users to that explains how
> we are handling their data and that is in line with our vision/what we
> stand for.
>
> > I've studied the wiki page more in depth and I have these points where I'd
> > like to see improvement. This is based on my experience, not a list of quick
> > ideas.
> >
> > https://community.kde.org/Talk:Policies/Telemetry_Policy#
>
> Thank you! Volker is probably best equipped to answer these.
>
> > That said: I will nod to the concept of "Minimalism", it is all classic
> > property of telemetry. I think I've seen them in other projects too.
> > I'd just say, let's not make all this more limited than anyone wants it to
> > be.
>
> Where is it too limited? Please keep in mind that we've set privacy as
> a core part of our vision and the current goals.
>

Lydia,
It's a core part, but still only a part, and it can't contradict, say, the
Freedom part.

Please see the list of limitations:

 https://community.kde.org/Talk:Policies/Telemetry_Policy#
(in my opinion those are not "nice to haves" but requirements needed so that
we can even call the whole thing "telemetry")

I am asking for alternative approaches; Volker once mentioned there are
some. We need them in order to move forward.

In the meantime my stack runs just fine. People who use IDs are even given
the right to remove their data, something that's *not* going to be possible
with the proposed vision. Someone would have to convince me otherwise.

-- 
regards, Jaroslaw Staniek

KDE:
: A world-wide network of software engineers, artists, writers, translators
: and facilitators committed to Free Software development - http://kde.org
KEXI:
: A visual database apps builder - http://calligra.org/kexi
  http://twitter.com/kexi_project https://facebook.com/kexi.project
Qt Certified Specialist:
: http://www.linkedin.com/in/jstaniek

Re: Telemetry Policy - Remaining Questions

2018-04-02 Thread Lydia Pintscher

Hey Jaroslaw :)

On Mon, Apr 2, 2018 at 10:28 PM, Jaroslaw Staniek  wrote:
> Thanks for reminding me Lydia
>
> I've not forgotten this. While there's progress I do still see this as a
> pilot stage and do not think we're in a hurry given telemetry is something
> "extra" for a project development, not a core feature of any product.

We are in a hurry now. We're waiting for projects to be able to start
using it and get us valuable insights about how our software is used.
We've been on it since last Akademy. Let's get it finished :)

> Below I am referring to this version:
> https://community.kde.org/index.php?title=Policies/Telemetry_Policy&oldid=78057
>
> tl;dr: Why discussing: Any deep change and limitation to projects' freedom
> needs to bring substantial benefits over drawbacks. Level of complexity of
> the contract for a project or individual developer needs to be balanced by
> real (not hypothetical) benefits.

The benefits here for KDE are:
* we have a better understanding of our userbase leading hopefully to
better software
* we have a better understanding of our userbase leading hopefully to
better marketing
* we have a clear policy we can point our users to that explains how
we are handling their data and that is in line with our vision/what we
stand for.

> I've studied the wiki page more in depth and I have these points where I'd
> like to see improvement. This is based on my experience, not a list of quick
> ideas.
>
> https://community.kde.org/Talk:Policies/Telemetry_Policy#

Thank you! Volker is probably best equipped to answer these.

> That said: I will nod to the concept of "Minimalism", it is all classic
> property of telemetry. I think I've seen them in other projects too.
> I'd just say, let's not make all this more limited than anyone wants it to
> be.

Where is it too limited? Please keep in mind that we've set privacy as
a core part of our vision and the current goals.

Cheers
Lydia

-- 
Lydia Pintscher - http://about.me/lydia.pintscher
KDE e.V. Board of Directors
http://kde.org - http://open-advice.org

Re: Telemetry Policy - Remaining Questions

2018-04-02 Thread Jaroslaw Staniek

On 1 April 2018 at 15:41, Lydia Pintscher  wrote:

> Hey folks :)
>
> We really need to wrap this up now. We need the data and we need the
> policy in place. It's holding us back from learning more about our
> users and making our software better. That's not good.
>
> On Sat, Nov 11, 2017 at 12:47 PM, Volker Krause  wrote:
> > So, I see the following possible ways forward:
> > (1) We accept the policy in its current spirit, and Kexi complies with it (if
> > necessary after some transition period).
>
> This would be the ideal way forward and the right thing to do.
>
> > (2) We accept the policy in its current spirit, and Kexi is exempt from it.
>
> As sebas said this is bad communication-wise. But if Kexi can't or
> doesn't want to comply that's the next best option.
>
> > (3) We make the policy opt-in, ie. using it merely as an extra quality
> > criteria for the applications wanting to follow it.
>
> I don't believe this is an option that's in line with our vision.
>
> > (4) We give up on the idea of regulating telemetry, rolling back on the
> > decision from Akademy.
>
> I don't think that's an acceptable option either because we need to
> get a better understanding of how our software is used in order to
> make it better.
>
> Jaroslaw: I think you need to make a choice now for Kexi. We can't let
> it sit and hope it goes away ;-)
>

Thanks for reminding me Lydia

I've not forgotten this. While there's progress I do still see this as a
pilot stage and do not think we're in a hurry given telemetry is something
"extra" for a project development, not a core feature of any product.

Below I am referring to this version:
https://community.kde.org/index.php?title=Policies/Telemetry_Policy&oldid=78057

tl;dr: Why discussing: Any deep change and limitation to projects' freedom
needs to bring substantial benefits over drawbacks. Level of complexity of
the contract for a project or individual developer needs to be balanced by
real (not hypothetical) benefits.

I've studied the wiki page more in depth and I have these points where I'd
like to see improvement. This is based on my experience, not a list of
quick ideas.

https://community.kde.org/Talk:Policies/Telemetry_Policy#

That said: I will nod to the concept of "Minimalism", it is all classic
property of telemetry. I think I've seen them in other projects too.
I'd just say, let's not make all this more limited than anyone wants it to
be.

-- 
regards, Jaroslaw Staniek

KDE:
: A world-wide network of software engineers, artists, writers, translators
: and facilitators committed to Free Software development - http://kde.org
KEXI:
: A visual database apps builder - http://calligra.org/kexi
  http://twitter.com/kexi_project https://facebook.com/kexi.project
Qt Certified Specialist:
: http://www.linkedin.com/in/jstaniek

Re: Telemetry Policy - Remaining Questions

2018-04-01 Thread Lydia Pintscher

Hey folks :)

We really need to wrap this up now. We need the data and we need the
policy in place. It's holding us back from learning more about our
users and making our software better. That's not good.

On Sat, Nov 11, 2017 at 12:47 PM, Volker Krause  wrote:
> So, I see the following possible ways forward:
> (1) We accept the policy in its current spirit, and Kexi complies with it (if
> necessary after some transition period).

This would be the ideal way forward and the right thing to do.

> (2) We accept the policy in its current spirit, and Kexi is exempt from it.

As sebas said this is bad communication-wise. But if Kexi can't or
doesn't want to comply that's the next best option.

> (3) We make the policy opt-in, ie. using it merely as an extra quality
> criteria for the applications wanting to follow it.

I don't believe this is an option that's in line with our vision.

> (4) We give up on the idea of regulating telemetry, rolling back on the
> decision from Akademy.

I don't think that's an acceptable option either because we need to
get a better understanding of how our software is used in order to
make it better.

Jaroslaw: I think you need to make a choice now for Kexi. We can't let
it sit and hope it goes away ;-)

Cheers
Lydia

-- 
Lydia Pintscher - http://about.me/lydia.pintscher
KDE e.V. Board of Directors
http://kde.org - http://open-advice.org

Re: Telemetry Policy - Remaining Questions

2017-11-11 Thread Volker Krause

On Tuesday, 31 October 2017 11:56:23 CET Sebastian Kügler wrote:
> On Tuesday, October 31, 2017 10:39:38 AM CET Volker Krause wrote:
> > On Monday, 30 October 2017 21:24:59 CET Albert Astals Cid wrote:
> > > On Monday, 30 October 2017, at 9:56:52 CET, Volker Krause wrote:
> > > 
> > > > Let's try to finally get this finished
> > > > 
> > > > The only remaining blocker is the unique identification used by Kexi.
> > > > There was some discussion about this around QtWS, and it seemed like
> > > > there was consensus on having a strong policy on this topic would be a
> > > > good thing for KDE, as opposed to e.g. turning this into just
> > > > recommendations, or opening it up to unique identification. The
> > > > suggested solution for Kexi was to add a special exception for it to the
> > > > "These rules apply to all products released by KDE." statement of the
> > > > policy.
> > > 
> > > I'm confused, is that a workaround so that it doesn't apply to Kexi
> > > by implying Kexi isn't released by KDE?
> > 
> > That sounds a bit convoluted to me, I was more thinking about making
> > it a direct exception to the policy, e.g. like this:
> > 
> > "These rules apply to all products released by KDE (with the
> > exception of Kexi, which uses a telemetry system predating this
> > policy)."
> 
> This will make the communication downright awful, as people will
> concentrate on the exception, not the rule.

Quite possible, yes.

> I'm thinking along the lines of require code released by KDE to adopt
> the policy and even add it to the manifesto as requirement to make it
> easier to enforce. 

With respect to T7050 I agree with this, although we'd probably need a broader 
set of privacy-related policies for that, telemetry is just one building block 
there.

> Kexi can always make it opt-in, and could be given
> some time to do so before we officially adopt and require this
> telemetry policy.

It's not about opt-in vs. opt-out, the problem with the current policy draft 
is the unique identification used in Kexi. In my understanding that would turn 
the collected data into "personal data" in the legal sense (similar to e.g. IP 
addresses). The whole thing was designed to avoid that and the consequences it 
has, so relaxing the restriction on unique identification essentially results 
in a considerable change to the spirit of the current draft.

So, I see the following possible ways forward:
(1) We accept the policy in its current spirit, and Kexi complies with it (if 
necessary after some transition period).
(2) We accept the policy in its current spirit, and Kexi is exempt from it.
(3) We make the policy opt-in, ie. using it merely as an extra quality 
criteria for the applications wanting to follow it.
(4) We give up on the idea of regulating telemetry, rolling back on the 
decision from Akademy.

Obviously, I'd prefer one of the first two options. But this has now been
dragging on since Akademy, and it leads to the bizarre situation that I am
only allowed to use some of my KDE code (KUserFeedback) in non-KDE
applications but not in KDE applications I'm working on...

Regards,
Volker


Re: Telemetry Policy - Remaining Questions

2017-10-31 Thread Jaroslaw Staniek

On 31 October 2017 at 11:56, Sebastian Kügler  wrote:
> On Tuesday, October 31, 2017 10:39:38 AM CET Volker Krause wrote:
>> On Monday, 30 October 2017 21:24:59 CET Albert Astals Cid wrote:
>> > On Monday, 30 October 2017, at 9:56:52 CET, Volker Krause wrote:
>> > > Let's try to finally get this finished
>> > >
>> > > The only remaining blocker is the unique identification used by Kexi.
>> > > There was some discussion about this around QtWS, and it seemed like
>> > > there was consensus on having a strong policy on this topic would be a
>> > > good thing for KDE, as opposed to e.g. turning this into just
>> > > recommendations, or opening it up to unique identification. The
>> > > suggested solution for Kexi was to add a special exception for it to the
>> > > "These rules apply to all products released by KDE." statement of the
>> > > policy.
>>
>> > I'm confused, is that a workaround so that it doesn't apply to Kexi
>> > by implying Kexi isn't released by KDE?
>>
>> That sounds a bit convoluted to me, I was more thinking about making
>> it a direct exception to the policy, e.g. like this:
>>
>> "These rules apply to all products released by KDE (with the
>> exception of Kexi, which uses a telemetry system predating this
>> policy)."
>
> This will make the communication downright awful, as people will
> concentrate on the exception, not the rule.
>

I am sure energy would be concentrated on the exception and on
nonconstructive activities (from 3rd parties?) because... please read
below:

> I'm thinking along the lines of require code released by KDE to adopt
> the policy and even add it to the manifesto as requirement to make it
> easier to enforce. Kexi can always make it opt-in, and could be given
> some time to do so before we officially adopt and require this
> telemetry policy.
> Jaroslaw, would that work for you?

because... one thing has apparently been missed even in this internal thread:
IIRC Kexi apps have never had an opt-out policy, even for anonymous telemetry.
I blogged about that as soon as the feature landed [1].
Users pick their level of involvement, zero by default, and telemetry is
presented as a way to get involved in the project, not as a threat.

[1] https://blogs.kde.org/2013/12/09/usage-stats

-- 
regards, Jaroslaw Staniek

KDE:
: A world-wide network of software engineers, artists, writers, translators
: and facilitators committed to Free Software development - http://kde.org
Calligra Suite:
: A graphic art and office suite - http://calligra.org
Kexi:
: A visual database apps builder - http://calligra.org/kexi
Qt Certified Specialist:
: http://www.linkedin.com/in/jstaniek

Re: Telemetry Policy - Remaining Questions

2017-10-31 Thread Sebastian Kügler

On Tuesday, October 31, 2017 10:39:38 AM CET Volker Krause wrote:
> On Monday, 30 October 2017 21:24:59 CET Albert Astals Cid wrote:
> > On Monday, 30 October 2017, at 9:56:52 CET, Volker Krause wrote:
> > > Let's try to finally get this finished 
> > > 
> > > The only remaining blocker is the unique identification used by Kexi.
> > > There was some discussion about this around QtWS, and it seemed like
> > > there was consensus on having a strong policy on this topic would be a
> > > good thing for KDE, as opposed to e.g. turning this into just
> > > recommendations, or opening it up to unique identification. The
> > > suggested solution for Kexi was to add a special exception for it to the
> > > "These rules apply to all products released by KDE." statement of the
> > > policy.
> 
> > I'm confused, is that a workaround so that it doesn't apply to Kexi
> > by implying Kexi isn't released by KDE?
> 
> That sounds a bit convoluted to me, I was more thinking about making
> it a direct exception to the policy, e.g. like this:
> 
> "These rules apply to all products released by KDE (with the
> exception of Kexi, which uses a telemetry system predating this
> policy)."

This will make the communication downright awful, as people will
concentrate on the exception, not the rule.

I'm thinking along the lines of require code released by KDE to adopt
the policy and even add it to the manifesto as requirement to make it
easier to enforce. Kexi can always make it opt-in, and could be given
some time to do so before we officially adopt and require this
telemetry policy.

Jaroslaw, would that work for you?
-- 
sebas

http://www.kde.org | http://vizZzion.org

Re: Telemetry Policy - Remaining Questions

2017-10-31 Thread Volker Krause

On Monday, 30 October 2017 11:27:58 CET Jaroslaw Staniek wrote:
> On 30 October 2017 at 09:56, Volker Krause  wrote:
> > Let's try to finally get this finished :)
> > 
> > The only remaining blocker is the unique identification used by Kexi.
> > There was some discussion about this around QtWS, and it seemed like
> > there was consensus on having a strong policy on this topic would be a
> > good thing for KDE, as opposed to e.g. turning this into just
> > recommendations, or opening it up to unique identification. The suggested
> > solution for Kexi was to add a special exception for it to the "These
> > rules apply to all products released by KDE." statement of the policy.
> > 
> > That would still leave us with a strong policy on this subject, while
> > solving the conflict with Kexi's current way of collecting telemetry.
> > Would that work for everyone?
> 
> Hello
> Thanks for pushing this forward Volker.
> In the meantime I had an idea: behave no differently than KDE web
> browsers do with unique cookies, e.g. with regard to KDE Identity
> accounts.
> Namely, there would be zero logic for IDs in Kexi itself, just a cookie
> feature with its standard behavior. As is already the case, it's opt-in.
> For now I hope this is technically feasible and the result is equivalent
> to the previous solution, if not even more flexible.
> I would appreciate having flaws in my assumption pointed out. The timeline
> for that can be tied to the development of sign-in features.
> 
> Unless there is desire to discuss exceptions for a range of KDE
> software that implements client side for web technologies maybe there
> is no need for adding specific exception for Kexi or having it
> communicated by Kexi itself.

We can obviously drop the exception as soon as it's no longer necessary. I'm 
not sure I fully understand the proposed implementation, but I do see local 
storage (via cookies or otherwise) as a way to address some of the unique 
identification use-cases, such as aggregating data from the same user.

The comparison to KDE Identity is a bit confusing though, as the objective of
that is unique identification for authorizing access, which inherently
conflicts with anonymity.

> I'd like to also mention an apparent lack of clarity for the outside user
> wrt what "products released by KDE" means. What the defaults are in
> deployed software is a decision of those who deploy the software;
> legal modifications are all right. KDE "only" releases the source code.
> So I would not place such a stamp "These rules apply to all products
> released by KDE" e.g. in About boxes because this has low info value
> for the actual user or can truly confuse.
> I am mentioning this here to emphasize that I see telemetry more as a
> part of the software deployment and support, not a part of the actual
> "source code product". Decoupling any logic from the source code is
> part of that.

Right, we cannot ultimately enforce this due to the free software licenses. 
While we can only ask external distributors to follow the same rules, we are 
also a distributor in a number of cases (Windows, Android, Linux app bundles, 
Neon, etc), and can apply the rules there too.

I agree that there is a communication/marketing aspect to this as well, and a 
policy document is not the right tool for that, especially when things get 
complicated.

Regards,
Volker

> > On Thursday, 14 September 2017 00:20:57 CET Volker Krause wrote:
> >> Hi,
> >> 
> >> as not everyone follows long threads, let's start again for the remaining
> >> issues.
> >> 
> >> https://community.kde.org/Policies/Telemetry_Policy
> >> 
> >> The following questions were left unanswered in the previous thread (see
> >> there for the full arguments if needed):
> >> 
> >> (1) Should we allow opt-in tracking of unique identifiers?
> >> 
> >> This was requested by Jaroslaw, as Kexi has this right now and the policy
> >> as written right now would thus conflict with it.
> >> 
> >> (2) Should we require/allow/forbid publication of the raw data?
> >> 
> >> Publication was suggested by Martin F. Practically, this would have to
> >> allow for a certain delay, we can't have public access to live data.
> >> Suitable licensing options of the data would probably be CC0 or
> >> CC-BY-SA.
> >> 
> >> (3) Should we require a revocation feature?
> >> 
> >> That is, allow the user to "delete" the data they submitted from the
> >> server. This was also suggested by Martin F, and is technically possible
> >> without compromising anonymity.
> >> 
> >> (4) Should we define limits on how long we store the raw data?
> >> 
> >> Brought up by Bhushan.
> >> 
> >> (5) Should we require an "audit log" feature?
> >> 
> >> That is, allow the user to see a detailed record of what has been
> >> submitted so far? Martin S suggested this (and it has been meanwhile
> >> implemented in KUserFeedback).
> >> 
> >> Not from the previous thread, but from a discussion in Randa:
> >> (6) What is the "lower bound" of where we consider this policy to ap

Re: Telemetry Policy - Remaining Questions

2017-10-31 Thread Volker Krause

On Monday, 30 October 2017 21:24:59 CET Albert Astals Cid wrote:
> On Monday, 30 October 2017, at 9:56:52 CET, Volker Krause wrote:
> > Let's try to finally get this finished :)
> > 
> > The only remaining blocker is the unique identification used by Kexi. There
> > was some discussion about this around QtWS, and it seemed like there was
> > consensus on having a strong policy on this topic would be a good thing for
> > KDE, as opposed to e.g. turning this into just recommendations, or opening
> > it up to unique identification. The suggested solution for Kexi was to add
> > a special exception for it to the "These rules apply to all products
> > released by KDE." statement of the policy.
> 
> I'm confused, is that a workaround so that it doesn't apply to Kexi by
> implying Kexi isn't released by KDE?

That sounds a bit convoluted to me, I was more thinking about making it a 
direct exception to the policy, e.g. like this:

"These rules apply to all products released by KDE (with the exception of 
Kexi, which uses a telemetry system predating this policy)."

Regards,
Volker

> > That would still leave us with a strong policy on this subject, while
> > solving the conflict with Kexi's current way of collecting telemetry. Would
> > that work for everyone?
> > 
> > Regards,
> > Volker
> > 
> > On Thursday, 14 September 2017 00:20:57 CET Volker Krause wrote:
> > > Hi,
> > > 
> > > as not everyone follows long threads, let's start again for the remaining
> > > issues.
> > > 
> > > https://community.kde.org/Policies/Telemetry_Policy
> > > 
> > > The following questions were left unanswered in the previous thread (see
> > > there for the full arguments if needed):
> > > 
> > > (1) Should we allow opt-in tracking of unique identifiers?
> > > 
> > > This was requested by Jaroslaw, as Kexi has this right now and the policy
> > > as written right now would thus conflict with it.
> > > 
> > > (2) Should we require/allow/forbid publication of the raw data?
> > > 
> > > Publication was suggested by Martin F. Practically, this would have to
> > > allow for a certain delay, we can't have public access to live data.
> > > Suitable licensing options of the data would probably be CC0 or
> > > CC-BY-SA.
> > > 
> > > (3) Should we require a revocation feature?
> > > 
> > > That is, allow the user to "delete" the data they submitted from the
> > > server. This was also suggested by Martin F, and is technically possible
> > > without compromising anonymity.
> > > 
> > > (4) Should we define limits on how long we store the raw data?
> > > 
> > > Brought up by Bhushan.
> > > 
> > > (5) Should we require an "audit log" feature?
> > > 
> > > That is, allow the user to see a detailed record of what has been
> > > submitted so far? Martin S suggested this (and it has been meanwhile
> > > implemented in KUserFeedback).
> > > 
> > > Not from the previous thread, but from a discussion in Randa:
> > > (6) What is the "lower bound" of where we consider this policy to apply?
> > > 
> > > That is, does checking for application updates/news (and possibly tracking
> > > that on the server) already count as "telemetry" in this context? See e.g.
> > > the current practice in Akregator or KDevelop.
> > > 
> > > 
> > > Allowing (1) might conflict with (2) allowing publication, unique
> > > identification brings us in personal data territory. Publication might also
> > > conflict with (3) and (4).
> > > 
> > > So, what's your view on those issues? :)
> > > 
> > > Thanks!
> > > Volker




Re: Telemetry Policy - Remaining Questions

2017-10-30 Thread Albert Astals Cid

On Monday, 30 October 2017, at 9:56:52 CET, Volker Krause wrote:
> Let's try to finally get this finished :)
> 
> The only remaining blocker is the unique identification used by Kexi. There
> was some discussion about this around QtWS, and it seemed like there was
> consensus on having a strong policy on this topic would be a good thing for
> KDE, as opposed to e.g. turning this into just recommendations, or opening
> it up to unique identification. The suggested solution for Kexi was to add
> a special exception for it to the "These rules apply to all products
> released by KDE." statement of the policy.

I'm confused, is that a workaround so that it doesn't apply to Kexi by 
implying Kexi isn't released by KDE? 

Cheers,
  Albert

> 
> That would still leave us with a strong policy on this subject, while
> solving the conflict with Kexi's current way of collecting telemetry. Would
> that work for everyone?
> 
> Regards,
> Volker
> 
> On Thursday, 14 September 2017 00:20:57 CET Volker Krause wrote:
> > Hi,
> > 
> > as not everyone follows long threads, let's start again for the remaining
> > issues.
> > 
> > https://community.kde.org/Policies/Telemetry_Policy
> > 
> > The following questions were left unanswered in the previous thread (see
> > there for the full arguments if needed):
> > 
> > (1) Should we allow opt-in tracking of unique identifiers?
> > 
> > This was requested by Jaroslaw, as Kexi has this right now and the policy
> > as written right now would thus conflict with it.
> > 
> > (2) Should we require/allow/forbid publication of the raw data?
> > 
> > Publication was suggested by Martin F. Practically, this would have to
> > allow for a certain delay, we can't have public access to live data.
> > Suitable licensing options of the data would probably be CC0 or CC-BY-SA.
> > 
> > (3) Should we require a revocation feature?
> > 
> > That is, allow the user to "delete" the data they submitted from the
> > server. This was also suggested by Martin F, and is technically possible
> > without compromising anonymity.
> > 
> > (4) Should we define limits on how long we store the raw data?
> > 
> > Brought up by Bhushan.
> > 
> > (5) Should we require an "audit log" feature?
> > 
> > That is, allow the user to see a detailed record of what has been
> > submitted so far? Martin S suggested this (and it has been meanwhile
> > implemented in KUserFeedback).
> > 
> > Not from the previous thread, but from a discussion in Randa:
> > (6) What is the "lower bound" of where we consider this policy to apply?
> > 
> > That is, does checking for application updates/news (and possibly tracking
> > that on the server) already count as "telemetry" in this context? See e.g.
> > the current practice in Akregator or KDevelop.
> > 
> > 
> > Allowing (1) might conflict with (2) allowing publication, unique
> > identification brings us in personal data territory. Publication might also
> > conflict with (3) and (4).
> > 
> > So, what's your view on those issues? :)
> > 
> > Thanks!
> > Volker

Re: Telemetry Policy - Remaining Questions

2017-10-30 Thread Jaroslaw Staniek

On 30 October 2017 at 09:56, Volker Krause  wrote:
> Let's try to finally get this finished :)
>
> The only remaining blocker is the unique identification used by Kexi. There
> was some discussion about this around QtWS, and it seemed like there was
> consensus that having a strong policy on this topic would be a good thing for
> KDE, as opposed to e.g. turning this into just recommendations, or opening it
> up to unique identification. The suggested solution for Kexi was to add a
> special exception for it to the "These rules apply to all products released
> by KDE." statement of the policy.
>
> That would still leave us with a strong policy on this subject, while solving
> the conflict with Kexi's current way of collecting telemetry. Would that work
> for everyone?

Hello
Thanks for pushing this forward Volker.
In the meantime I got an inspired idea to behave no different than KDE
web browsers do with unique cookies e.g. wrt the KDE Identity
accounts.
Namely there would be zero logic for IDs in Kexi itself but a cookie
feature with its standard behavior. As it's the case, it's opt-in.
For now I hope this is technically feasible and the result is equivalent
to the previous solution, if not even more flexible.
I would appreciate having flaws in my assumption pointed out. The timeline
for that can be connected to the development of sign-in features.

Unless there is desire to discuss exceptions for a range of KDE
software that implements client side for web technologies maybe there
is no need for adding specific exception for Kexi or having it
communicated by Kexi itself.

I'd like to also mention an apparent lack of clarity for the outside user
wrt what "products released by KDE" means. What the defaults are in
deployed software is a decision of those who deploy the software;
legal modifications are all right. KDE "only" releases the source code.
So I would not place such a stamp "These rules apply to all products
released by KDE" e.g. in About boxes because this has low info value
for the actual user or can truly confuse.
I am mentioning this here to emphasize that I see telemetry more as a
part of the software deployment and support, not a part of the actual
"source code product". Decoupling any logic from the source code is
part of that.

> On Thursday, 14 September 2017 00:20:57 CET Volker Krause wrote:
>> Hi,
>>
>> as not everyone follows long threads, let's start again for the remaining
>> issues.
>>
>> https://community.kde.org/Policies/Telemetry_Policy
>>
>> The following questions were left unanswered in the previous thread (see
>> there for the full arguments if needed):
>>
>> (1) Should we allow opt-in tracking of unique identifiers?
>>
>> This was requested by Jaroslaw, as Kexi has this right now and the policy as
>> written right now would thus conflict with it.
>>
>> (2) Should we require/allow/forbid publication of the raw data?
>>
>> Publication was suggested by Martin F. Practically, this would have to allow
>> for a certain delay, we can't have public access to live data. Suitable
>> licensing options of the data would probably be CC0 or CC-BY-SA.
>>
>> (3) Should we require a revocation feature?
>>
>> That is, allow the user to "delete" the data they submitted from the server.
>> This was also suggested by Martin F, and is technically possible without
>> compromising anonymity.
>>
>> (4) Should we define limits on how long we store the raw data?
>>
>> Brought up by Bhushan.
>>
>> (5) Should we require an "audit log" feature?
>>
>> That is, allow the user to see a detailed record of what has been submitted
>> so far? Martin S suggested this (and it has been meanwhile implemented in
>> KUserFeedback).
>>
>> Not from the previous thread, but from a discussion in Randa:
>> (6) What is the "lower bound" of where we consider this policy to apply?
>>
>> That is, does checking for application updates/news (and possibly tracking
>> that on the server) already count as "telemetry" in this context? See e.g.
>> the current practice in Akregator or KDevelop.
>>
>>
>> Allowing (1) might conflict with (2) allowing publication, unique
>> identification brings us in personal data territory. Publication might also
>> conflict with (3) and (4).
>>
>> So, what's your view on those issues? :)
>>
>> Thanks!
>> Volker
>

-- 
regards, Jaroslaw Staniek

KDE:
: A world-wide network of software engineers, artists, writers, translators
: and facilitators committed to Free Software development - http://kde.org
Calligra Suite:
: A graphic art and office suite - http://calligra.org
Kexi:
: A visual database apps builder - http://calligra.org/kexi
Qt Certified Specialist:
: http://www.linkedin.com/in/jstaniek

Re: Telemetry Policy - Remaining Questions

2017-10-30 Thread Volker Krause

Let's try to finally get this finished :)

The only remaining blocker is the unique identification used by Kexi. There was 
some discussion about this around QtWS, and it seemed like there was consensus
that having a strong policy on this topic would be a good thing for KDE, as
opposed to e.g. turning this into just recommendations, or opening it up to 
unique identification. The suggested solution for Kexi was to add a special 
exception for it to the "These rules apply to all products released by KDE." 
statement of the policy.

That would still leave us with a strong policy on this subject, while solving 
the conflict with Kexi's current way of collecting telemetry. Would that work 
for everyone?

Regards,
Volker

On Thursday, 14 September 2017 00:20:57 CET Volker Krause wrote:
> Hi,
> 
> as not everyone follows long threads, let's start again for the remaining
> issues.
> 
> https://community.kde.org/Policies/Telemetry_Policy
> 
> The following questions were left unanswered in the previous thread (see
> there for the full arguments if needed):
> 
> (1) Should we allow opt-in tracking of unique identifiers?
> 
> This was requested by Jaroslaw, as Kexi has this right now and the policy as
> written right now would thus conflict with it.
> 
> (2) Should we require/allow/forbid publication of the raw data?
> 
> Publication was suggested by Martin F. Practically, this would have to allow
> for a certain delay, we can't have public access to live data. Suitable
> licensing options of the data would probably be CC0 or CC-BY-SA.
> 
> (3) Should we require a revocation feature?
> 
> That is, allow the user to "delete" the data they submitted from the server.
> This was also suggested by Martin F, and is technically possible without
> compromising anonymity.
> 
> (4) Should we define limits on how long we store the raw data?
> 
> Brought up by Bhushan.
> 
> (5) Should we require an "audit log" feature?
> 
> That is, allow the user to see a detailed record of what has been submitted
> so far? Martin S suggested this (and it has been meanwhile implemented in
> KUserFeedback).
> 
> Not from the previous thread, but from a discussion in Randa:
> (6) What is the "lower bound" of where we consider this policy to apply?
> 
> That is, does checking for application updates/news (and possibly tracking
> that on the server) already count as "telemetry" in this context? See e.g.
> the current practice in Akregator or KDevelop.
> 
> 
> Allowing (1) might conflict with (2) allowing publication, unique
> identification brings us in personal data territory. Publication might also
> conflict with (3) and (4).
> 
> So, what's your view on those issues? :)
> 
> Thanks!
> Volker



signature.asc
Description: This is a digitally signed message part.

Re: Telemetry Policy - Remaining Questions

2017-09-16 Thread Volker Krause

On Saturday, 16 September 2017 07:54:48 CEST Nicolás Alvarez wrote:
> 2017-09-15 4:27 GMT-03:00 Volker Krause :
> > On Friday, 15 September 2017 05:23:44 CEST Nicolás Alvarez wrote:
> >> From Mozilla documentation: "So when you say '63% of beta 53 has
> >> Firefox set as its default browser', make sure you specify it is 63%
> >> of *pings*, since it is only around 46% of clients. (Apparently users
> >> with Firefox Beta 53 set as their default browser submit more
> >> main-pings than users who don't)."
> > 
> > That is something else though, that's the participation ratio on opt-in.
> > Measuring that and determining its bias on the submitted data is indeed a
> > challenge, but I don't see how unique ids help with that?
> 
> It has nothing to do with participation ratio; Firefox betas have
> opt-out telemetry.
> 
> 63% of Firefox Beta 53 telemetry records (which Mozilla calls "pings")
> say it was set as the default browser. If you deduplicate using client
> ID, it turns out 46% of distinct Firefox Beta 53 clients had it set as
> the default browser. Thus, for some reason users who set it as the
> default browser send telemetry more frequently than those who didn't,
> perhaps because they use the browser more.

Ah, sorry, I misunderstood this then. Deduplication should work in this
case with the time-based approach too, I think.

Regards,
Volker


Re: Telemetry Policy - Remaining Questions

2017-09-15 Thread Nicolás Alvarez

2017-09-15 4:27 GMT-03:00 Volker Krause :
> On Friday, 15 September 2017 05:23:44 CEST Nicolás Alvarez wrote:
>> From Mozilla documentation: "So when you say '63% of beta 53 has
>> Firefox set as its default browser', make sure you specify it is 63%
>> of *pings*, since it is only around 46% of clients. (Apparently users
>> with Firefox Beta 53 set as their default browser submit more
>> main-pings than users who don't)."
>
> That is something else though, that's the participation ratio on opt-in.
> Measuring that and determining its bias on the submitted data is indeed a
> challenge, but I don't see how unique ids help with that?

It has nothing to do with participation ratio; Firefox betas have
opt-out telemetry.

63% of Firefox Beta 53 telemetry records (which Mozilla calls "pings")
say it was set as the default browser. If you deduplicate using client
ID, it turns out 46% of distinct Firefox Beta 53 clients had it set as
the default browser. Thus, for some reason users who set it as the
default browser send telemetry more frequently than those who didn't,
perhaps because they use the browser more.
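
To make the pings-versus-clients difference concrete, here is a minimal sketch in Python. The records, client IDs and function names are made up for illustration; this is not Mozilla data or Mozilla's analysis code.

```python
# Hypothetical telemetry records: (client_id, is_default_browser).
# Clients "a" and "b" use the app more and therefore send more pings.
pings = [
    ("a", True), ("a", True), ("a", True),
    ("b", True), ("b", True),
    ("c", False),
    ("d", False),
    ("e", False),
]


def share_of_pings(records):
    """Fraction of raw records that report the setting as enabled."""
    return sum(1 for _, enabled in records if enabled) / len(records)


def share_of_clients(records):
    """Fraction of distinct clients whose latest record has the setting enabled."""
    latest = {}
    for client_id, enabled in records:
        latest[client_id] = enabled  # later records overwrite earlier ones
    return sum(1 for enabled in latest.values() if enabled) / len(latest)


print(share_of_pings(pings))    # 0.625
print(share_of_clients(pings))  # 0.4
```

With this toy data, 62.5% of pings but only 40% of clients report the setting as enabled, because the two heavy users are counted several times in the per-ping figure.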

-- 
Nicolás

Re: Telemetry Policy - Remaining Questions

2017-09-15 Thread Volker Krause

On Friday, 15 September 2017 05:23:44 CEST Nicolás Alvarez wrote:
> 2017-09-14 15:56 GMT-03:00 Albert Astals Cid :
> > On Thursday, 14 September 2017, at 0:20:57 CEST, Volker Krause wrote:
> >> The following questions were left unanswered in the previous thread (see
> >> there for the full arguments if needed):
> >> 
> >> (1) Should we allow opt-in tracking of unique identifiers?
> >> 
> >> This was requested by Jaroslaw, as Kexi has this right now and the policy
> >> as written right now would thus conflict with it.
> > 
> > I missed this, what's the usecase of unique id data?
> 
> Without a unique ID, each time the app sends telemetry, the record is
> independent and not correlated to previous records. Generating a
> random "client ID" and persisting it in some file in $HOME, and
> including it in the uploaded data, lets you calculate statistics per
> client, which is more useful than per telemetry record.
> 
> It's hard to know what percentage of users have a setting
> enabled if we don't have a client ID, since some users may send more
> telemetry reports than others (for multiple reasons, including using
> the app more often). If we have one, we can avoid double-counting
> multiple reports from the same client.

That is true if you send telemetry per application start. That's not the only
way to do it though. Quoting myself from the earlier thread:

> The implementation in KUserFeedback addresses this by fixed interval data
> submission. If you then aggregate the received data by the same interval,
> you can see e.g. how ratios of application versions develop over time.
> 
> This does have limits of course, you can't distinguish between the same
> person using the application every sampling interval, or two people using
> it every other interval for example. With a sufficiently long sampling
> interval the result should nevertheless be sufficiently accurate I think.

and in a bit more detail here:
https://mail.kde.org/pipermail/kde-community/2017q3/003917.html

Ie. I think we can do the de-duplication by other means than unique 
identification.
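
A rough sketch of the server-side aggregation this implies, in Python. It assumes each client submits at most once per sampling interval and that reports carry only a date and an application version; the 7-day interval, the epoch and all names are assumptions for illustration, not the KUserFeedback implementation.

```python
from collections import Counter
from datetime import date

# Assumed sampling interval; each client submits at most once per interval.
INTERVAL_DAYS = 7
EPOCH = date(2017, 1, 2)  # arbitrary Monday used as the start of interval 0


def interval_index(day):
    """Map a calendar day to the index of the sampling interval containing it."""
    return (day - EPOCH).days // INTERVAL_DAYS


def version_shares(reports):
    """Per interval, the share of each application version among the reports."""
    counts = {}
    for day, version in reports:
        counts.setdefault(interval_index(day), Counter())[version] += 1
    return {
        idx: {version: n / sum(c.values()) for version, n in c.items()}
        for idx, c in counts.items()
    }


# Hypothetical anonymous reports: (submission date, application version).
reports = [
    (date(2017, 1, 3), "3.0"), (date(2017, 1, 5), "3.0"),
    (date(2017, 1, 6), "3.0"), (date(2017, 1, 8), "3.1"),
    (date(2017, 1, 10), "3.0"), (date(2017, 1, 12), "3.1"),
]

shares = version_shares(reports)
```

Here interval 0 comes out as 75% "3.0" and 25% "3.1", and interval 1 as 50% each, without any record being linked to a client. The limitation quoted above still applies: one person reporting in every interval is indistinguishable from two people reporting in alternating intervals.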
 
> From Mozilla documentation: "So when you say '63% of beta 53 has
> Firefox set as its default browser', make sure you specify it is 63%
> of *pings*, since it is only around 46% of clients. (Apparently users
> with Firefox Beta 53 set as their default browser submit more
> main-pings than users who don't)."

That is something else though, that's the participation ratio on opt-in. 
Measuring that and determining its bias on the submitted data is indeed a 
challenge, but I don't see how unique ids help with that?

Regards,
Volker


Re: Telemetry Policy - Remaining Questions

2017-09-14 Thread Nicolás Alvarez

2017-09-14 15:56 GMT-03:00 Albert Astals Cid :
> On Thursday, 14 September 2017, at 0:20:57 CEST, Volker Krause wrote:
>> The following questions were left unanswered in the previous thread (see
>> there for the full arguments if needed):
>>
>> (1) Should we allow opt-in tracking of unique identifiers?
>>
>> This was requested by Jaroslaw, as Kexi has this right now and the policy as
>> written right now would thus conflict with it.
>
> I missed this, what's the usecase of unique id data?

Without a unique ID, each time the app sends telemetry, the record is
independent and not correlated to previous records. Generating a
random "client ID" and persisting it in some file in $HOME, and
including it in the uploaded data, lets you calculate statistics per
client, which is more useful than per telemetry record.

It's hard to know what percentage of users have a setting
enabled if we don't have a client ID, since some users may send more
telemetry reports than others (for multiple reasons, including using
the app more often). If we have one, we can avoid double-counting
multiple reports from the same client.

From Mozilla documentation: "So when you say '63% of beta 53 has
Firefox set as its default browser', make sure you specify it is 63%
of *pings*, since it is only around 46% of clients. (Apparently users
with Firefox Beta 53 set as their default browser submit more
main-pings than users who don't)."
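
For illustration, generating and persisting such a random client ID could look roughly like the Python sketch below. The function name and file location are hypothetical; this is not how Kexi or Firefox actually store their IDs.

```python
import uuid
from pathlib import Path


def load_or_create_client_id(path):
    """Return the persisted client ID, creating a random one on first use."""
    path = Path(path)
    if path.exists():
        return path.read_text(encoding="utf-8").strip()
    client_id = str(uuid.uuid4())  # random, not derived from user or machine data
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(client_id, encoding="utf-8")
    return client_id
```

An application would call this with some file under $HOME (for example a hypothetical `~/.config/myapp/client_id`) and include the returned value in each upload. Deleting the file resets the ID.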

-- 
Nicolás

Re: Telemetry Policy - Remaining Questions

2017-09-14 Thread Volker Krause

On Thursday, 14 September 2017 20:56:49 CEST Albert Astals Cid wrote:
> On Thursday, 14 September 2017, at 0:20:57 CEST, Volker Krause wrote:
> > (1) Should we allow opt-in tracking of unique identifiers?
> > 
> > This was requested by Jaroslaw, as Kexi has this right now and the policy
> > as written right now would thus conflict with it.
> 
> I missed this, what's the usecase of unique id data?

IIUC Kexi submits telemetry on each startup (opt-in, of course). That does not 
allow you to distinguish between one user starting the application multiple 
times or many users starting it once (which has a lot of implications on how 
to interpret/weight the data). Look for Jaroslaw's contributions to the 
previous thread for the full picture.

Regards,
Volker


Re: Telemetry Policy - Remaining Questions

2017-09-14 Thread Albert Astals Cid

On Thursday, 14 September 2017, at 0:20:57 CEST, Volker Krause wrote:
> Hi,
> 
> as not everyone follows long threads, let's start again for the remaining
> issues.
> 
> https://community.kde.org/Policies/Telemetry_Policy
> 
> The following questions were left unanswered in the previous thread (see
> there for the full arguments if needed):
> 
> (1) Should we allow opt-in tracking of unique identifiers?
> 
> This was requested by Jaroslaw, as Kexi has this right now and the policy as
> written right now would thus conflict with it.

I missed this, what's the usecase of unique id data?

> 
> (2) Should we require/allow/forbid publication of the raw data?
> 
> Publication was suggested by Martin F. Practically, this would have to allow
> for a certain delay, we can't have public access to live data. Suitable
> licensing options of the data would probably be CC0 or CC-BY-SA.

I'd say forbid, making a read/write system is much more complex than what we 
want to do.

> 
> (3) Should we require a revocation feature?
> 
> That is, allow the user to "delete" the data they submitted from the server.
> This was also suggested by Martin F, and is technically possible without
> compromising anonymity.

It'd be nice, but I see that as maybe a second step if we can do it nicely
(which calls for making sure we have proper versioning in place so we can
update stuff).

> 
> (4) Should we define limits on how long we store the raw data?
> 
> Brought up by Bhushan.

If it doesn't identify anyone, I don't see the problem of having raw data
(other than raw disk space).

> 
> (5) Should we require an "audit log" feature?
> 
> That is, allow the user to see a detailed record of what has been submitted
> so far? Martin S suggested this (and it has been meanwhile implemented in
> KUserFeedback).

I'd say so yes, it's important to be transparent about what we submit.

> 
> Not from the previous thread, but from a discussion in Randa:
> (6) What is the "lower bound" of where we consider this policy to apply?
> 
> That is, does checking for application updates/news (and possibly tracking
> that on the server) already count as "telemetry" in this context? See e.g.
> the current practice in Akregator or KDevelop.

I think the lower bound is "we store things with the aim to use it". When you
check for updates, you may or may not store things (and no, an Apache log
doesn't really count as "storing with the aim to use it"); if we do, it's
telemetry, if we don't, it's not.

Cheers,
  Albert

> 
> 
> Allowing (1) might conflict with (2) allowing publication, unique
> identification brings us in personal data territory. Publication might also
> conflict with (3) and (4).
> 
> So, what's your view on those issues? :)
> 
> Thanks!
> Volker

Re: Telemetry Policy - Remaining Questions

2017-09-14 Thread Volker Krause

On Thursday, 14 September 2017 09:02:07 CEST Adriaan de Groot wrote:
> On Wednesday, September 13, 2017 6:20:57 PM EDT Volker Krause wrote:
> > as not everyone follows long threads, let's start again for the remaining
> > issues.
> 
> Thanks for pulling the threads back together again.
> 
> > (1) Should we allow opt-in tracking of unique identifiers?
> > 
> > This was requested by Jaroslaw, as Kexi has this right now and the policy
> > as written right now would thus conflict with it.
> 
> I think it's acceptable if it is (a) opt-in (b) the wording is sufficiently
> clear (c) no functionality is dependent on it.

This would essentially mean removing the Anonymity section from the current 
policy draft though.

> > (2) Should we require/allow/forbid publication of the raw data?
> 
> Forbidding is easier on the policy and on the admins. Forbidding means we do
> not have to a priori figure out what can be published and arm ourselves
> against de-anonymisation attacks. Also, I think this could be changed
> later: *if* there is no PII in the data, *and* we can publish in a sensible
> fashion, then there's nothing for individuals to object to.

I'd be fine with not regulating this aspect for now, until we actually have 
some data to base the decision on.

> > (3) Should we require a revocation feature?
> 
> What is the narrative here? (I guess I could go back to the other thread to
> look)

It's mainly about the control aspect. I.e. giving the user control to delete 
submitted data again if they change their mind.

Nice to have, but IMHO not something that would need to be mandatory, 
especially since it gets hard to communicate in combination with publishing.

> > (4) Should we define limits on how long we store the raw data?
> 
> Yes. Something dependent on how often we publish (even if just publishing
> internal collations of data) the aggregated data.
> 
> > (5) Should we require an "audit log" feature?
> > 
> > > That is, allow the user to see a detailed record of what has been
> > > submitted so far? Martin S suggested this (and it has been meanwhile
> > > implemented in KUserFeedback).
> 
> Is that based on what the client knows, or what the server knows? This ties
> in to (3), above.

The current implementation is the client view. A server view could be 
implemented like (3) indeed, without compromising anonymity.

Regards,
Volker



Re: Telemetry Policy - Remaining Questions

2017-09-14 Thread Nicolás Alvarez

2017-09-13 19:20 GMT-03:00 Volker Krause :
> (1) Should we allow opt-in tracking of unique identifiers?
>
> This was requested by Jaroslaw, as Kexi has this right now and the policy as
> written right now would thus conflict with it.

My view is that we need opt-out telemetry with unique identifiers to
get useful conclusions out, but from what I saw in the big thread I
know this is an unpopular opinion.

> (2) Should we require/allow/forbid publication of the raw data?
>
> Publication was suggested by Martin F. Practically, this would have to allow
> for a certain delay, we can't have public access to live data. Suitable
> licensing options of the data would probably be CC0 or CC-BY-SA.

FWIW, Mozilla does not publish raw data. Mozilla employees can run
custom analyses on raw data (for cases where the public aggregated
data isn't enough), but I think even they can't just look at
individual records.

(Then again, Mozilla raw data contains unique identifiers, which can
make their privacy requirements stricter)

I also wouldn't want to "require" publication of raw data for
practical/technical reasons: until we implement some minimal
pseudo-telemetry pings, we don't even know how many active users we
have, so we can't estimate what's the total volume of raw data we will
end up collecting. It may end up needing extra server resources to
store and aggregate. I wouldn't want to also make said giant raw data
publicly available. Maybe it's technically possible, but I don't want
to already require/promise that we will do it.

-- 
Nicolás

Re: Telemetry Policy - Remaining Questions

2017-09-14 Thread Adriaan de Groot

On Wednesday, September 13, 2017 6:20:57 PM EDT Volker Krause wrote:
> as not everyone follows long threads, let's start again for the remaining
> issues.

Thanks for pulling the threads back together again.

> (1) Should we allow opt-in tracking of unique identifiers?
> 
> This was requested by Jaroslaw, as Kexi has this right now and the policy as
> written right now would thus conflict with it.

I think it's acceptable if it is (a) opt-in (b) the wording is sufficiently 
clear (c) no functionality is dependent on it.

> (2) Should we require/allow/forbid publication of the raw data?

Forbidding is easier on the policy and on the admins. Forbidding means we do 
not have to a priori figure out what can be published and arm ourselves against 
de-anonymisation attacks. Also, I think this could be changed later: *if* 
there is no PII in the data, *and* we can publish in a sensible fashion, then 
there's nothing for individuals to object to.

> (3) Should we require a revocation feature?

What is the narrative here? (I guess I could go back to the other thread to 
look) 

> (4) Should we define limits on how long we store the raw data?

Yes. Something dependent on how often we publish (even if just publishing 
internal collations of data) the aggregated data.

> (5) Should we require an "audit log" feature?
> 
> That is, allow the user to see a detailed record of what has been submitted
> so far? Martin S suggested this (and it has been meanwhile implemented in
> KUserFeedback).

Is that based on what the client knows, or what the server knows? This ties in 
to (3), above.

> Not from the previous thread, but from a discussion in Randa:
> (6) What is the "lower bound" of where we consider this policy to apply?
> 
> That is, does checking for application updates/news (and possibly tracking
> that on the server) already count as "telemetry" in this context? See e.g.
> the current practice in Akregator or KDevelop.

There are two sides to this: application intent, and what is done on the
server side. Checking for updates involves no attempt at telemetry, unless
there's something arranged server-side.

[ade]


Telemetry Policy - Remaining Questions

2017-09-13 Thread Volker Krause

Hi,

as not everyone follows long threads, let's start again for the remaining 
issues.

https://community.kde.org/Policies/Telemetry_Policy

The following questions were left unanswered in the previous thread (see there 
for the full arguments if needed):

(1) Should we allow opt-in tracking of unique identifiers?

This was requested by Jaroslaw, as Kexi has this right now and the policy as 
written right now would thus conflict with it.

(2) Should we require/allow/forbid publication of the raw data?

Publication was suggested by Martin F. Practically, this would have to allow 
for a certain delay, we can't have public access to live data. Suitable 
licensing options of the data would probably be CC0 or CC-BY-SA.

(3) Should we require a revocation feature?

That is, allow the user to "delete" the data they submitted from the server. 
This was also suggested by Martin F, and is technically possible without 
compromising anonymity.

(4) Should we define limits on how long we store the raw data?

Brought up by Bhushan.

(5) Should we require an "audit log" feature?

That is, allow the user to see a detailed record of what has been submitted so
far? Martin S suggested this (and it has been meanwhile implemented in 
KUserFeedback).

Not from the previous thread, but from a discussion in Randa:
(6) What is the "lower bound" of where we consider this policy to apply?

That is, does checking for application updates/news (and possibly tracking 
that on the server) already count as "telemetry" in this context? See e.g. the 
current practice in Akregator or KDevelop.


Allowing (1) might conflict with (2) allowing publication, unique
identification brings us in personal data territory. Publication might also
conflict with (3) and (4).

So, what's your view on those issues? :)

Thanks!
Volker



Re: Telemetry Policy

2017-08-28 Thread Volker Krause

On Friday, 25 August 2017 10:11:09 CEST Boudewijn Rempt wrote:
> On Sun, 13 Aug 2017, Volker Krause wrote:
> > # Telemetry Policy Draft
> 
> Do we already have a wiki page I can link to? I want to publish
> Alexey's experimental build today, and that needs to be announced
> pretty carefully.

Yep: https://community.kde.org/Policies/Telemetry_Policy

Regards,
Volker



Re: Telemetry Policy

2017-08-25 Thread Boudewijn Rempt

On Sun, 13 Aug 2017, Volker Krause wrote:

> # Telemetry Policy Draft

Do we already have a wiki page I can link to? I want to publish
Alexey's experimental build today, and that needs to be announced
pretty carefully.
-- 
Boudewijn Rempt | http://www.krita.org, http://www.valdyas.org

Re: Telemetry Policy

2017-08-24 Thread Jaroslaw Staniek

On 24 August 2017 at 16:54, Nicolás Alvarez  wrote:
>
>> On 24 Aug 2017, at 07:41, Jaroslaw Staniek  wrote:
>>
>>> On 24 August 2017 at 11:10, Adriaan de Groot  wrote:
>>>
>>> Curiously, there's a lot of "telemetry policy" news items popping up this
>>> week, for instance:
>>>
>>>Mozilla ponders making telemetry opt-out, 'cos hardly anyone opted in
>>>
>>> (that's on the Register) and there were others. So it looks like
>>> communication -- what's the data for, why is it collected, and what can
>>> happen to it -- is key here.
>>>
>>> [ade]
>>
>> Speaking of that please let me play devil's advocate. In Europe,
>> especially Poland all web sites/web apps that collect cookies must
>> obtain permission to do that from the user. Interestingly there are
>> usually OK buttons only so the message is only an information.
>> Sometimes there is "Don't agree" button which is equal to close the
>> site. So telemetry-like behavior even lacks opt-out.
>>
>> [...]
>>
>> I can imagine we would make our pages work without cookies and add
>> opt-in buttons to each main site.
>>
>> Now KDE context since there's visible call to make privacy our pillar topic:
>> 1. Does www.kde.org for example use cookies?
>
> Yes, and we show the comply-with-Europe-law banner letting the user know 
> about those cookies. We also follow the browser Do Not Track setting and we 
> don't collect statistics if that is set.
>

To meet obligations of the European law it's enough. Obvious but for
the record: if we're discussing how much we would care about privacy
-- this is delegating the responsibility to third-party software. Which
may or may not have this implemented or (most often) *has tracking set
as opt-out*. Mozilla, Apple, Google for example have. MS tried to be
the good boy [1]. We and GNOME are.

[1] 
https://en.wikipedia.org/wiki/Do_Not_Track#Internet_Explorer_10_default_setting_controversy

> --
> Nicolás
> KDE Sysadmin Team



-- 
regards, Jaroslaw Staniek

KDE:
: A world-wide network of software engineers, artists, writers, translators
: and facilitators committed to Free Software development - http://kde.org
Calligra Suite:
: A graphic art and office suite - http://calligra.org
Kexi:
: A visual database apps builder - http://calligra.org/kexi
Qt Certified Specialist:
: http://www.linkedin.com/in/jstaniek

Re: Telemetry Policy

2017-08-24 Thread Nicolás Alvarez


> On 24 Aug 2017, at 07:41, Jaroslaw Staniek  wrote:
> 
>> On 24 August 2017 at 11:10, Adriaan de Groot  wrote:
>> 
>> Curiously, there's a lot of "telemetry policy" news items popping up this
>> week, for instance:
>> 
>>Mozilla ponders making telemetry opt-out, 'cos hardly anyone opted in
>> 
>> (that's on the Register) and there were others. So it looks like
>> communication -- what's the data for, why is it collected, and what can
>> happen to it -- is key here.
>> 
>> [ade]
> 
> Speaking of that please let me play devil's advocate. In Europe,
> especially Poland all web sites/web apps that collect cookies must
> obtain permission to do that from the user. Interestingly there are
> usually OK buttons only so the message is only an information.
> Sometimes there is "Don't agree" button which is equal to close the
> site. So telemetry-like behavior even lacks opt-out.
> 
> [...]
> 
> I can imagine we would make our pages work without cookies and add
> opt-in buttons to each main site.
> 
> Now KDE context since there's visible call to make privacy our pillar topic:
> 1. Does www.kde.org for example use cookies?

Yes, and we show the comply-with-Europe-law banner letting the user know about 
those cookies. We also follow the browser Do Not Track setting and we don't 
collect statistics if that is set.
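
As a rough illustration of honoring that setting on the server side (a sketch only, not the actual kde.org code; the function name is made up), the check boils down to looking at the `DNT` request header:

```python
def should_collect_statistics(headers):
    """Return False when the request carries the Do Not Track header set to "1"."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") != "1"
```

A request with `DNT: 1` is then left out of the statistics, while a request with no such header, or with `DNT: 0`, is counted.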

-- 
Nicolás
KDE Sysadmin Team

Re: Telemetry Policy

2017-08-24 Thread Jaroslaw Staniek

On 24 August 2017 at 11:10, Adriaan de Groot  wrote:
> On Saturday 19 August 2017 12:02:03 Volker Krause wrote:
>> Good point, I clarified the intended meaning of "opt-in" in the wiki, that
>> is:  off by default and only activated by explicit action of the user
>> (inaction is not good enough).
>
> Curiously, there's a lot of "telemetry policy" news items popping up this
> week, for instance:
>
> Mozilla ponders making telemetry opt-out, 'cos hardly anyone opted in
>
> (that's on the Register) and there were others. So it looks like communication
> -- what's the data for, why is it collected, and what can happen to it -- is
> key here.
>
> [ade]

Speaking of that, please let me play devil's advocate. In Europe,
especially Poland, all web sites/web apps that use cookies must
obtain permission to do that from the user. Interestingly, there is
usually only an OK button, so the message is purely informational.
Sometimes there is a "Don't agree" button, which is equivalent to
closing the site. So telemetry-like behavior even lacks opt-out.

I mention this because it's not obvious to me why one technology for
building computer utilities should be preferred over other technologies
when it comes to behavior around telemetry. (I count an ordinary web page
as a computer utility in this discussion too, because the boundaries have
been blurred since the day the first JavaScript-enabled page arrived.)

I can imagine we would make our pages work without cookies and add
opt-in buttons to each main site.

Now the KDE context, since there's a visible call to make privacy our pillar topic:
1. Does www.kde.org, for example, use cookies?
2. Are there Privacy, Cookies, and Legal pages linked from the main site?
(The mentioned mozilla.org has them, as do many other sites.)
OK: Legal is delegated to the e.V. page (I bet the e.V. link from
kde.org is much less informative than the "Legal" link on Mozilla).
Privacy is buried on a (googleable) page
https://identity.kde.org/index.php?r=site/page&view=privacypolicy.

-- 
regards, Jaroslaw Staniek

KDE:
: A world-wide network of software engineers, artists, writers, translators
: and facilitators committed to Free Software development - http://kde.org
Calligra Suite:
: A graphic art and office suite - http://calligra.org
Kexi:
: A visual database apps builder - http://calligra.org/kexi
Qt Certified Specialist:
: http://www.linkedin.com/in/jstaniek

Re: Telemetry Policy

2017-08-24 Thread Adriaan de Groot

On Saturday 19 August 2017 12:02:03 Volker Krause wrote:
> Good point, I clarified the intended meaning of "opt-in" in the wiki, that
> is:  off by default and only activated by explicit action of the user
> (inaction is not good enough).

Curiously, there's a lot of "telemetry policy" news items popping up this 
week, for instance:

Mozilla ponders making telemetry opt-out, 'cos hardly anyone opted in

(that's on the Register) and there were others. So it looks like communication 
-- what's the data for, why is it collected, and what can happen to it -- is 
key here.

[ade]

Re: Telemetry Policy

2017-08-21 Thread Volker Krause

On Sunday, 20 August 2017 22:29:28 CEST Jaroslaw Staniek wrote:
> On 19 August 2017 at 11:39, Volker Krause  wrote:
> > On Friday, 18 August 2017 11:23:49 CEST Jaroslaw Staniek wrote:
> > > On 17 August 2017 at 16:19, Volker Krause  wrote:
> > > > On Wednesday, 16 August 2017 20:35:59 CEST Jaroslaw Staniek wrote:
> > > My assumption when started with telemetry was having adequate level of
> > > precision. Assuming no logs are fabricated as fake interesting questions
> > > are for example: how many users actually run supported software and how
> > > many run outdated one? Not how many executions per given period of time
> > > because it may be that old software is executed by a few users very
> > > frequently for some reason. e.g. because 3 years old software crashes on 
> > > old OS every minute and restart was needed :)
> > > 
> > > How to know that without unique (anonymous) identification?
> > > Using extra fields such as OS+Desktop type/version would be indeed a
> > > form of cheap UID.
> > > But I would say disclosing OS+Desktop type/version for that discloses
> > > more than the anonymous random UID represents.
> > > In bugzilla and mailing list we're asking for all this information too
> > > anyway and (at least I) do not like supporting anonymous users since I
> > > am not anonymous.
> > 
> > The implementation in KUserFeedback addresses this by fixed interval data
> > submission. If you then aggregate the received data by the same interval,
> > you can see e.g. how ratios of application versions develop over time.
> > 
> > This does have limits of course, you can't distinguish between the same
> > person using the application every sampling interval, or two people using 
> > it every other interval for example. With a sufficiently long sampling 
> > interval the result should nevertheless be sufficiently accurate I think.
> 
> Volker, thanks for sharing this. I don't see how this is an approximation.
> Do you probe in given time intervals and/or measure time spent with the
> application? How do you handle time zones (e.g. zero usage of version X
> that is used only in the USA for some reason)?
> 
> KEXI sends the feedback data on startup only. I have no idea if this is
> compatible with any other approach but this helps to ignore different usage
> patterns, e.g. these two basic and typical to KEXI and many apps:
> 
> - user starts the app and keeps it open for half of the day
> - user frequently starts the app multiple times (for any reason) and has
> multiple instances open
> 
> If I remember correctly we're not measuring how long the app is used, this
> can be perceived as quite private information, by the way. Interesting data
> but so far not collected.
> 
> Moreover based on my specific experience giving up the IDs softens the data
> any more complex than app version: Alice can use module M of the app
> primarily and Bob can use module N mostly. Without IDs we have a set of
> mixed probes that include usage of both modules in no particular order
> (maybe per locale or timezone or other factor but this is not worth
> guessing IMHO). We don't even know if there are module-based preferences
> among the users.

Let's look at a concrete example:

{
  "applicationVersion": { "value": "2.8.50" },
  "compiler": { "type": "GCC", "version": "7.1" },
  "opengl": {
    "glslVersion": "1.30",
    "renderer": "Haswell Mobile ",
    "type": "GL",
    "vendor": "Intel",
    "vendorVersion": "Mesa 17.1.4",
    "version": "3.0"
  },
  "platform": {
    "os": "linux",
    "version": "opensuse-tumbleweed"
  },
  "qtVersion": { "value": "5.9.2" },
  "startCount": { "value": 34 },
  "toolRatio": {
    "objectinspector": { "property": 0.7619047619047619 },
    "quickinspector": { "property": 0.23809523809523808 }
  },
  "usageTime": { "value": 12113 }
}

This is what a local GammaRay instance on this machine would currently send 
once a week, if I enable the maximum telemetry level. "Once a week" is the 
approximation I was referring to, as that of course assumes the application is 
actually running. If it isn't, it sends at the next opportunity after that. 

However, this means that when looking at all samples received in one week, I 
can be reasonably sure to only have included each installation at most once.
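
To make the aggregation idea concrete, here is a rough sketch in Python. It is not how the KUserFeedback analytics tool is implemented; the function name and the input shape (a receive date plus an application version per sample) are assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import date


def version_shares_by_week(samples):
    """Group samples by ISO week and compute the share of each version.

    `samples` is an iterable of (received_date, application_version) pairs
    (hypothetical shape). Returns {(iso_year, iso_week): {version: share}}.
    """
    counts = defaultdict(Counter)
    for received, version in samples:
        iso = received.isocalendar()
        counts[(iso[0], iso[1])][version] += 1
    return {
        week: {version: n / sum(c.values()) for version, n in c.items()}
        for week, c in counts.items()
    }


samples = [
    (date(2017, 8, 14), "2.8.50"),
    (date(2017, 8, 15), "2.8.50"),
    (date(2017, 8, 16), "2.7.0"),
    (date(2017, 8, 21), "2.8.50"),
]
shares = version_shares_by_week(samples)
for week in sorted(shares):
    print(week, shares[week])
```

Because each installation submits at most once per interval, the per-week shares approximate shares of installations rather than shares of application starts.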

The data includes the number of application starts, so you can distinguish 
between users who start it frequently for short sessions and users who keep 
it running all the time, as you mentioned. Neither usage pattern changes the 
general statistics though, as both submit the same amount of data.

The raw data stays available, aggregation only happens in the analytics tool. 
So you can of course still correlate various values (feature usage depending 
on OS or locale, for example). This also means you can see if features A and B 
are used in equal parts by all users, or half the users use primarily A and 
the other half primarily B.
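
As a sketch of that last point (again not the actual analytics tool): if every sample carries per-feature usage ratios, the spread of a ratio across samples tells the two cases apart even without user IDs. The flat `{feature: ratio}` dicts below are a simplified, assumed shape, not the exact `toolRatio` layout shown above.

```python
def feature_spread(samples, feature):
    """Return (mean, minimum, maximum) of one feature's usage ratio.

    `samples` is a list of {feature_name: usage_ratio} dicts, one per
    submission (hypothetical, simplified shape).
    """
    values = [ratios.get(feature, 0.0) for ratios in samples]
    return sum(values) / len(values), min(values), max(values)


# Everyone uses A and B in equal parts.
uniform = [{"A": 0.5, "B": 0.5}, {"A": 0.5, "B": 0.5}]
# Half the users use primarily A, the other half primarily B.
split = [{"A": 0.9, "B": 0.1}, {"A": 0.1, "B": 0.9}]

print(feature_spread(uniform, "A"))
print(feature_spread(split, "A"))
```

Both populations have the same mean ratio for A, but only the split population shows a wide gap between minimum and maximum, which is the module-preference signal asked about earlier in the thread.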

GammaRay does track the usage time,

Re: Telemetry Policy

2017-08-20 Thread Jaroslaw Staniek


On 19 August 2017 at 11:39, Volker Krause  wrote:

> On Friday, 18 August 2017 11:23:49 CEST Jaroslaw Staniek wrote:
> > On 17 August 2017 at 16:19, Volker Krause  wrote:
> > > On Wednesday, 16 August 2017 20:35:59 CEST Jaroslaw Staniek wrote:
> > > > On 16 August 2017 at 18:56, Volker Krause  wrote:
> > > > > On Wednesday, 16 August 2017 15:23:07 CEST Jaroslaw Staniek wrote:
> > > > > > On 16 August 2017 at 14:13, Volker Krause  wrote:
> [...]
> > > > > - Kexi seems to (optionally?) contain a unique identifier
> > > >
> > > > This is mostly related to cases when any kind of cloud storage is used.
> > > > These cases involve unique accounts already so users can be identified
> > > > very well even without having telemetry functionality.
> > > >
> > > > KEXI installations limited to open-core, used away from a cloud, do not
> > > > need identifiers.
> > > > However I understand that identifiers, independent of network or host ID
> > > > (basically a random-generated QUuids) are useful for even basic
> > > > telemetry needs. Without them it's easy to abuse the system using any
> > > > kind of bots to trick us that e.g. 99% of sessions happen on KDE 1.0 or
> > > > that given Linux distro has 90% of the global market :)
> > >
> > > Vandalism is a potential problem indeed (did you actually have issues with
> > > that on Kexi btw? if so, what counter-measures did you apply?). However I
> > > don't see how a UUID is helping here, the bot could just as well generate
> > > UUIDs for each submission?
> >
> > UIDs indeed can't help with too clever bots but e.g. semi-evil use cases
> > such as executing apps in batch mode can be caught. I've mostly encountered
> > logs coming from test machines including myself so I probably should not
> > have used the term 'bots' but (as unrealistic as it sounds) real bots can
> > be created.
>
> Ok, so that's more an accident scenario than vandalism/abuse. Wouldn't the
> more targeted counter-measure be to just disable telemetry for the
> development team?
>

In KEXI, in an anti-corporate fashion, we don't distinguish the development
team from the non-development team. All users are on the team by definition
once they agree to support telemetry. That's one of the motivators.



>
> > > > Similarly app projects may need the IDs to answer question about most
> > > > and least used features. Most used as in "most users found it,
> > > > understood it and use it", not "most usage reports has been delivered
> > > > for it (maybe coming from a single user -- maybe even my very own
> > > > co-developer). There are many other examples probably already discussed.
> > >
> > > Sure this gets easier with unique ids, but it's not impossible without
> > > them.
> > > After all the goal here isn't to make our lives easier, but to agree on
> > > something that is acceptable for our users. And yes, that might imply more
> > > work and/or less accurate data.
> >
> > My assumption when started with telemetry was having adequate level of
> > precision. Assuming no logs are fabricated as fake interesting questions
> > are for example: how many users actually run supported software and how
> > many run outdated one? Not how many executions per given period of time
> > because it may be that old software is executed by a few users very
> > frequently for some reason. e.g. because 3 years old software crashes on old
> > OS every minute and restart was needed :)
> >
> > How to know that without unique (anonymous) identification?
> > Using extra fields such as OS+Desktop type/version would be indeed a form
> > of cheap UID.
> > But I would say disclosing OS+Desktop type/version for that discloses more
> > than the anonymous random UID represents.
> > In bugzilla and mailing list we're asking for all this information too
> > anyway and (at least I) do not like supporting anonymous users since I am
> > not anonymous.
>
> The implementation in KUserFeedback addresses this by fixed interval data
> submission. If you then aggregate the received data by the same interval, you
> can see e.g. how ratios of application versions develop over time.
>
> This does have limits of course, you can't distinguish between the same person
> using the application every sampling interval, or two people using it every
> other interval for example. With a sufficiently long sampling interval the
> result should nevertheless be sufficiently accurate I think.
>

Volker, thanks for sharing this. I don't see how this is an approximation.
Do you probe in given time intervals and/or measure time spent with the
application? How do you handle time zones (e.g. zero usage of version X
that is used only in the USA for some reason)?

KEXI sends the feedback data on startup only. I have no idea if this is
compatible with any other approach but this helps to ignore different usage
patterns, e.g. these two basic and typical to KEXI and many apps:

- user starts the app and keeps it open for half of the day
- user frequ

Re: Telemetry Policy

2017-08-19 Thread Volker Krause

On Saturday, 19 August 2017 05:37:58 CEST Nicolás Alvarez wrote:
> Sent from my iPhone
> >> On 16 Aug 2017, at 20:46, Thomas Pfeiffer  wrote:
> >> 
> >> On Wednesday, 16 August 2017 09:33:02 CEST Valorie Zimmerman wrote:
> >> Hi all, Mozilla has done a lot of work on telemetry, and we might be
> >> able to use some of their findings. On this page:
> >> https://wiki.mozilla.org/Firefox/Data_Collection they break down the
> >> data they might possibly collect into four buckets - technical (such
> >> as crashes), user interaction, web activity, and sensitive (personal
> >> data).
> >> 
> >> This bit might be relevant to our discussion: "Categories 1 & 2
> >> (Technical & Interaction data)
> >> Pre-Release & Release: Data may default on, provided the data is
> >> exclusively in these categories (it cannot be in any other category).
> >> In Release, an opt-out must be available for most types of Technical
> >> and Interaction data. "
> >> 
> >> I think the entire page might be enlightening to this discussion. I
> >> believe our analysis of needs should be more fine-grained, and that
> >> some parts of what we need can be "default on" especially for
> >> pre-release testing. For releases, we can provide an opt-out.
> > 
> > Hi Valorie,
> > Even if opt-out for some data is legally and even morally fine, it does not
> > align with the values we communicate to our users:
> > Unlike Mozilla's Mission, our Vision mentions privacy explicitly, and we're
> > striving to make privacy our USP.
> > 
> > Therefore I agree with others who replied in this thread: We should respect
> > privacy unnecessarily much rather than too little.
> > 
> > In the end, of course, it's a matter of how we present this opt-in. If
> > it's an option buried in some settings dialog, we might as well not do it
> > at all.
> > 
> > If we, however - like Firefox does -, prominently present that choice to
> > users the first time they run one of our applications or desktop
> > environment and try to make clear why that data collection is important
> > for us, I don't see why we could not convince a relevant number of users
> > to opt in.
> > Sure, we'll get less data than with an opt-out scheme, but let's try it out
> > first before we go for the option that carries a significant PR risk.
> 
> I think discussing this as "opt-in" vs "opt-out" may be misleading. In
> terms of amount of data collected, there would be a big difference
> between going to Preferences and ticking a checkbox hidden somewhere,
> and a first-run pop up that asks you, yet both could be reasonably
> called "opt-in".
> 
> Similarly, in terms of privacy, there is a big difference between
> opt-out by un-ticking a checkbox hidden somewhere vs opt-out via a
> passive banner that says "we collect anonymous usage information,
> click here if you don't want that" before any data is sent.
> 
> In slightly different words, as said in #kde-devel:
> <argonel> opt-in doesn't mean "unadvertised"
> <nicolas17> and opt-out doesn't mean you have to dig in settings
> to turn it off
> <nicolas17> so I think putting it as an opt-in vs opt-out binary
> discussion is too simplistic
> 
> We need to talk about the specifics of what opt-in or opt-out *means*.
> Is there a first-use prompt? Is it a passive banner or a modal popup?
> What happens by default if I ignore it? What happens by default
> between starting the app and answering the prompt? How do I change the
> choice later?

Good point, I clarified the intended meaning of "opt-in" in the wiki, that is: 
off by default and only activated by explicit action of the user (inaction is 
not good enough).
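
Expressed as a small sketch, under the assumption of a three-way prompt result (accepted, declined, ignored). The class and method names are made up for illustration and are not KUserFeedback API.

```python
class TelemetrySettings:
    """Hypothetical settings holder illustrating the opt-in rule."""

    def __init__(self):
        # Off by default: nothing is sent until the user acts.
        self.enabled = False

    def handle_prompt(self, answer):
        """Apply the user's answer to a telemetry prompt.

        answer is True (explicitly accepted), False (explicitly declined)
        or None (prompt ignored or dismissed without a choice).
        """
        if answer is None:
            # Inaction is not consent: leave the current state untouched.
            return
        self.enabled = bool(answer)


settings = TelemetrySettings()
settings.handle_prompt(None)   # user ignored the prompt
print(settings.enabled)
settings.handle_prompt(True)   # user explicitly opted in
print(settings.enabled)
```

The point of the `None` branch is that ignoring or closing the prompt never changes the state, so a fresh installation stays off until there is an explicit yes.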

The "advertisement" or "encouragement" (as it's called in KUserFeedback) isn't 
regulated in the current policy draft beyond the obvious: for example, the 
setting to turn telemetry off again can't be arbitrarily hidden, and you can't 
force opt-in by tying the setting to unrelated features.

The current implementation in KUserFeedback shows a passive popup after a 
bit of usage; a blocking first-start dialog asking for your data would give 
the wrong impression of our priorities IMHO. But that can depend on how an 
application is structured in general of course, and I'd assume PR 
self-preservation will prevent overly aggressive approaches.

Regards,
Volker

Re: Telemetry Policy

2017-08-19 Thread Volker Krause

On Friday, 18 August 2017 11:23:49 CEST Jaroslaw Staniek wrote:
> On 17 August 2017 at 16:19, Volker Krause  wrote:
> > On Wednesday, 16 August 2017 20:35:59 CEST Jaroslaw Staniek wrote:
> > > On 16 August 2017 at 18:56, Volker Krause  wrote:
> > > > On Wednesday, 16 August 2017 15:23:07 CEST Jaroslaw Staniek wrote:
> > > > > On 16 August 2017 at 14:13, Volker Krause  wrote:
[...]
> > > > - Kexi seems to (optionally?) contain a unique identifier
> > > 
> > > This is mostly related to cases when any kind of cloud storage is used.
> > > These cases involve unique accounts already so users can be identified
> > > very well even without having telemetry functionality.
> > > 
> > > KEXI installations limited to open-core, used away from a cloud, do not
> > > need identifiers.
> > > However I understand that identifiers, independent of network or host ID
> > > (basically a random-generated QUuids) are useful for even basic
> > > telemetry needs. Without them it's easy to abuse the system using any
> > > kind of bots to trick us that e.g. 99% of sessions happen on KDE 1.0 or 
> > > that given Linux distro has 90% of the global market :)
> > 
> > Vandalism is a potential problem indeed (did you actually have issues with
> > that on Kexi btw? if so, what counter-measures did you apply?). However I
> > don't see how a UUID is helping here, the bot could just as well generate
> > UUIDs for each submission?
> 
> UIDs indeed can't help with too clever bots but e.g. semi-evil use cases
> such as executing apps in batch mode can be caught. I've mostly encountered
> logs coming from test machines including myself so I probably should not
> have used the term 'bots' but (as unrealistic as it sounds) real bots can
> be created.

Ok, so that's more an accident scenario than vandalism/abuse. Wouldn't the 
more targeted counter-measure be to just disable telemetry for the development 
team?

> > > Similarly app projects may need the IDs to answer question about most
> > > and least used features. Most used as in "most users found it, 
> > > understood it and use it", not "most usage reports has been delivered 
> > > for it (maybe coming from a single user -- maybe even my very own co-
> > > developer). There are many other examples probably already discussed.
> > 
> > Sure this gets easier with unique ids, but it's not impossible without
> > them.
> > After all the goal here isn't to make our lives easier, but to agree on
> > something that is acceptable for our users. And yes, that might imply more
> > work and/or less accurate data.
> 
> My assumption when started with telemetry was having adequate level of
> precision. Assuming no logs are fabricated as fake interesting questions
> are for example: how many users actually run supported software and how
> many run outdated one? Not how many executions per given period of time
> because it may be that old software is executed by a few users very
> frequently for some reason. e.g. because 3 years old software crashes on old
> OS every minute and restart was needed :)
> 
> How to know that without unique (anonymous) identification?
> Using extra fields such as OS+Desktop type/version would be indeed a form
> of cheap UID.
> But I would say disclosing OS+Desktop type/version for that discloses more
> than the anonymous random UID represents.
> In bugzilla and mailing list we're asking for all this information too
> anyway and (at least I) do not like supporting anonymous users since I am
> not anonymous.

The implementation in KUserFeedback addresses this by fixed interval data 
submission. If you then aggregate the received data by the same interval, you 
can see e.g. how ratios of application versions develop over time.

This does have limits of course, you can't distinguish between the same person 
using the application every sampling interval, or two people using it every 
other interval for example. With a sufficiently long sampling interval the 
result should nevertheless be sufficiently accurate I think.

> BTW, it's worth to remind, the UID is not even a hash of any host and user
> info, it's a random number. I do admit that "hash of a host and user info"
> would be even better as it allows to recreate the UID after e.g. OS has
> been reinstalled or new account created. But I do not use hashing for KEXI
> anyway.
> 
> > > Thus I would see the Anonymity is covered by KEXI's approach except that
> > > it offers opt-in tracking of unique user for unique installations. KEXI
> > > currently does not track unique installations at all until the user 
> > > agrees for any telemetry (the KexiUserFeedbackAgent::
> > > AnonymousIdentificationArea value). This is required by nature of stats 
> > > computed (and abuses mentioned above are the reason).
> > > 
> > > Is this a big deal? We're close to philosophy area here.
> > 
> > Correct, this is about the philosophy behind our products :) And one very
> > core part of that happens to be privacy.
> > 
> > That basically leaves the que

Re: Telemetry Policy

2017-08-18 Thread Nicolás Alvarez

Sent from my iPhone
>> On 16 Aug 2017, at 20:46, Thomas Pfeiffer  wrote:
>>
>> On Wednesday, 16 August 2017 09:33:02 CEST Valorie Zimmerman wrote:
>> Hi all, Mozilla has done a lot of work on telemetry, and we might be
>> able to use some of their findings. On this page:
>> https://wiki.mozilla.org/Firefox/Data_Collection they break down the
>> data they might possibly collect into four buckets - technical (such
>> as crashes), user interaction, web activity, and sensitive (personal
>> data).
>>
>> This bit might be relevant to our discussion: "Categories 1 & 2
>> (Technical & Interaction data)
>> Pre-Release & Release: Data may default on, provided the data is
>> exclusively in these categories (it cannot be in any other category).
>> In Release, an opt-out must be available for most types of Technical
>> and Interaction data. "
>>
>> I think the entire page might be enlightening to this discussion. I
>> believe our analysis of needs should be more fine-grained, and that
>> some parts of what we need can be "default on" especially for
>> pre-release testing. For releases, we can provide an opt-out.
>
> Hi Valorie,
> Even if opt-out for some data is legally and even morally fine, it does not
> align with the values we communicate to our users:
> Unlike Mozilla's Mission, our Vision mentions privacy explicitly, and we're
> striving to make privacy our USP.
>
> Therefore I agree with others who replied in this thread: We should respect
> privacy unnecessarily much rather than too little.
>
> In the end, of course, it's a matter of how we present this opt-in. If it's an
> option buried in some settings dialog, we might as well not do it at all.
>
> If we, however - like Firefox does -, prominently present that choice to users
> the first time they run one of our applications or desktop environment and try
> to make clear why that data collection is important for us, I don't see why we
> could not convince a relevant number of users to opt in.
> Sure, we'll get less data than with an opt-out scheme, but let's try it out
> first before we go for the option that carries a significant PR risk.

I think discussing this as "opt-in" vs "opt-out" may be misleading. In
terms of amount of data collected, there would be a big difference
between going to Preferences and ticking a checkbox hidden somewhere,
and a first-run pop up that asks you, yet both could be reasonably
called "opt-in".

Similarly, in terms of privacy, there is a big difference between
opt-out by un-ticking a checkbox hidden somewhere vs opt-out via a
passive banner that says "we collect anonymous usage information,
click here if you don't want that" before any data is sent.

In slightly different words, as said in #kde-devel:
<argonel> opt-in doesn't mean "unadvertised"
<nicolas17> and opt-out doesn't mean you have to dig in settings
to turn it off
<nicolas17> so I think putting it as an opt-in vs opt-out binary
discussion is too simplistic

We need to talk about the specifics of what opt-in or opt-out *means*.
Is there a first-use prompt? Is it a passive banner or a modal popup?
What happens by default if I ignore it? What happens by default
between starting the app and answering the prompt? How do I change the
choice later?

-- 
Nicolás

Re: Telemetry Policy

2017-08-18 Thread Jaroslaw Staniek

On 17 August 2017 at 16:19, Volker Krause  wrote:

> On Wednesday, 16 August 2017 20:35:59 CEST Jaroslaw Staniek wrote:
> > On 16 August 2017 at 18:56, Volker Krause  wrote:
> > > On Wednesday, 16 August 2017 15:23:07 CEST Jaroslaw Staniek wrote:
> > > > On 16 August 2017 at 14:13, Volker Krause  wrote:
> [...]
> > > > In addition maybe distributors can sometimes make the decision based on
> > > > opinions from given subprojects.
> > > > For example the option would be pre-set to ON in KEXI's installer for
> > > > Mac and Windows itself and for Linux AppImages, not in the source code.
> > > > Just saying, KEXI has not yet switched to the new framework :)
> > >
> > > The policy we are discussing here is (and is supposed to be) independent
> > > of the implementation. And that's not just theoretical, Kexi is one
> > > prominent case for an alternative implementation, and the Krita GSoC also
> > > seems to contain some alternative server code for example. So input in
> > > particular from those teams matters a lot for me, as this policy in its
> > > current form would affect them too.
> > >
> > > And a policy we only adhere in code and work around in the end by putting
> > > on a distributor hat (which we can in many places, as your examples show)
> > > isn't really helping, I'd much rather have it reflect what we actually do
> > > :)
> > >
> > > From having read the code of both, I think the only possible points of
> > > conflict with the policy draft might be:
> > > - opt-in
> >
> > Source code has 100% of opt-in (grep for
> > 'areas(KexiUserFeedbackAgent::NoAreas)'). Anyone is free to change this
> > default and create distribution under his name and I understand this will
> > not be a "distribution by KDE".
>
> Great, so I just misread both Krita's and Kexi's requirements here and we
> don't have a problem :)
>

In the case of KEXI, the idea from the very beginning was also to make some
distros happy (by avoiding the need to patch the source).


> > > - hosted on KDE infrastructure
> >
> > My assumption: As KEXI is an open-core+whatever-license-for-plugins
> > architecture, ultimately the telemetry information from KEXI users would be
> > better hosted by KDE. Any extra information retrieved by plugins (if that
> > even exists) can be hosted elsewhere but this is a responsibility of
> > plugin developers.
>
> Yep, I'd say 3rd party addons are encouraged to follow the same policy, just
> like distributors, but we have no way of actually enforcing it.
>
> > > - Kexi seems to (optionally?) contain a unique identifier
> >
> > This is mostly related to cases when any kind of cloud storage is used.
> > These cases involve unique accounts already so users can be identified very
> > well even without having telemetry functionality.
> >
> > KEXI installations limited to open-core, used away from a cloud, do not
> > need identifiers.
> > However I understand that identifiers, independent of network or host ID
> > (basically a random-generated QUuids) are useful for even basic telemetry
> > needs. Without them it's easy to abuse the system using any kind of bots to
> > trick us that e.g. 99% of sessions happen on KDE 1.0 or that given Linux
> > distro has 90% of the global market :)
>
> Vandalism is a potential problem indeed (did you actually have issues with
> that on Kexi btw? if so, what counter-measures did you apply?). However I
> don't see how a UUID is helping here, the bot could just as well generate
> UUIDs for each submission?
>

UIDs indeed can't help with overly clever bots, but e.g. semi-evil use cases
such as executing apps in batch mode can be caught. I've mostly encountered
logs coming from test machines, including my own, so I probably should not
have used the term 'bots', but (as unrealistic as it sounds) real bots can
be created.


> > Similarly app projects may need the IDs to answer question about most and
> > least used features. Most used as in "most users found it, understood it
> > and use it", not "most usage reports has been delivered for it (maybe
> > coming from a single user -- maybe even my very own co-developer). There
> > are many other examples probably already discussed.
>
> Sure this gets easier with unique ids, but it's not impossible without
> them.
> After all the goal here isn't to make our lives easier, but to agree on
> something that is acceptable for our users. And yes, that might imply more
> work and/or less accurate data.
>

My assumption when I started with telemetry was that we would have an adequate
level of precision. Assuming no logs are fabricated, interesting questions
are for example: how many users actually run supported software and how
many run outdated software? Not how many executions per given period of time,
because it may be that old software is executed by a few users very
frequently for some reason, e.g. because 3-year-old software crashes on an old
OS every minute and a restart was needed :)

How to know that without unique (anonymous) identification?
Using e

Re: Telemetry Policy

2017-08-17 Thread Jaroslaw Staniek

On 17 August 2017 at 18:20, Thomas Pfeiffer  wrote:

> On 17. Aug 2017, at 17:38, Mirko Boehm - KDE  wrote:
>
> Hi,
>
> On 17. Aug 2017, at 01:46, Thomas Pfeiffer  wrote:
>
> Hi Valorie,
> Even if opt-out for some data is legally and even morally fine, it does not
> align with the values we communicate to our users:
> Unlike Mozilla's Mission, our Vision mentions privacy explicitly, and we're
> striving to make privacy our USP.
>
> We seem to assume a contradiction between telemetry and privacy. I believe
> this is a knee-jerk reaction. We can implement telemetry in a way that
> privacy is not violated. In fact, I would say that it follows from our
> vision that we should do this.
>
> The problem is: I expect users to have the same knee-jerk reaction. I
> don’t see us being able to explain to users that actually their privacy is
> perfectly safe before they freak out.
> Privacy-minded Free Software users have freaked out in the past over
> things which objectively speaking were not a huge deal.
> It’s emotion more than rational arguments
>
It's hard to argue here or to generalize across all apps' communities. The
Krita community, for example, differs from the gcc community in these respects.

-- 
regards, Jaroslaw Staniek

KDE:
: A world-wide network of software engineers, artists, writers, translators
: and facilitators committed to Free Software development - http://kde.org
Calligra Suite:
: A graphic art and office suite - http://calligra.org
Kexi:
: A visual database apps builder - http://calligra.org/kexi
Qt Certified Specialist:
: http://www.linkedin.com/in/jstaniek

Re: Telemetry Policy

2017-08-17 Thread Thomas Pfeiffer

> On 17. Aug 2017, at 17:38, Mirko Boehm - KDE  wrote:
> 
> Hi, 
> 
>> On 17. Aug 2017, at 01:46, Thomas Pfeiffer wrote:
>> 
>> Hi Valorie,
>> Even if opt-out for some data is legally and even morally fine, it does not 
>> align with the values we communicate to our users:
>> Unlike Mozilla's Mission, our Vision mentions privacy explicitly, and we're 
>> striving to make privacy our USP.
> 
> We seem to assume a contradiction between telemetry and privacy. I believe 
> this is a knee-jerk reaction. We can implement telemetry in a way that 
> privacy is not violated. In fact, I would say that it follows from our vision 
> that we should do this.
> 

The problem is: I expect users to have the same knee-jerk reaction. I don’t see 
us being able to explain to users that actually their privacy is perfectly safe 
before they freak out.
Privacy-minded Free Software users have freaked out in the past over things 
which objectively speaking were not a huge deal.
It’s emotion more than rational arguments

Re: Telemetry Policy

2017-08-17 Thread Mirko Boehm - KDE

Hi,

> On 17. Aug 2017, at 01:46, Thomas Pfeiffer  wrote:
> 
> Hi Valorie,
> Even if opt-out for some data is legally and even morally fine, it does not
> align with the values we communicate to our users:
> Unlike Mozilla's Mission, our Vision mentions privacy explicitly, and we're
> striving to make privacy our USP.

We seem to assume a contradiction between telemetry and privacy. I believe this 
is a knee-jerk reaction. We can implement telemetry in a way that privacy is 
not violated. In fact, I would say that it follows from our vision that we 
should do this.

Cheers,

Mirko.
--
Mirko Boehm | mi...@kde.org | KDE e.V.
FSFE Fellowship Representative, FSFE Team Germany
Qt Certified Specialist and Trainer
Request a meeting: https://doodle.com/mirkoboehm


Re: Telemetry Policy

2017-08-17 Thread Volker Krause

On Wednesday, 16 August 2017 20:35:59 CEST Jaroslaw Staniek wrote:
> On 16 August 2017 at 18:56, Volker Krause  wrote:
> > On Wednesday, 16 August 2017 15:23:07 CEST Jaroslaw Staniek wrote:
> > > On 16 August 2017 at 14:13, Volker Krause  wrote:
[...]
> > > In addition maybe distributors can sometimes make the decision based on 
> > > opinions from given subprojects.
> > > For example the option would be pre-set to ON in KEXI's installer for 
> > > Mac and  Windows itself and for Linux AppImages, not in the source code.
> > > Just saying, KEXI has not yet switched to the new framework :)
> > 
> > The policy we are discussing here is (and is supposed to be) independent
> > of the implementation. And that's not just theoretical, Kexi is one 
> > prominent case for an alternative implementation, and the Krita GSoC also 
> > seems to contain some alternative server code for example. So input in 
> > particular from those teams matters a lot for me, as this policy in its 
> > current form would affect them too.
> > 
> > And a policy we only adhere in code and work around in the end by putting
> > on a distributor hat (which we can in many places, as your examples show) 
> > isn't really helping, I'd much rather have it reflect what we actually do 
> > :)
> > 
> > From having read the code of both, I think the only possible points of
> > conflict with the policy draft might be:
> > - opt-in
> 
> Source code has 100% of opt-in (grep for
> 'areas(KexiUserFeedbackAgent::NoAreas)'). Anyone is free to change this
> default and create distribution under his name and I understand this will
> not be a "distribution by KDE".

Great, so I just misread both Krita's and Kexi's requirements here and we 
don't have a problem :)

> > - hosted on KDE infrastructure
> 
> My assumption: As KEXI is an open-core+whatever-license-for-plugins
> architecture, ultimately the telemetry information from KEXI users would be
> better hosted by KDE. Any extra information retrieved by plugins (if that
> even exists) can be hosted elsewhere but and this is a responsibility of
> plugin developers.

Yep, I'd say 3rd party addons are encouraged to follow the same policy, just 
like distributors, but we have no way of actually enforcing it.

> > - Kexi seems to (optionally?) contain a unique identifier
> 
> This is mostly related to cases when any kind of cloud storage is used.
> These cases involve unique accounts already so users can be identified very
> well even without having telemetry functionality.
> 
> KEXI installations limited to open-core, used away from a cloud, do not
> need identifiers.
> However I understand that identifiers, independent of network or host ID
> (basically a random-generated QUuids) are useful for even basic telemetry
> needs. Without them it's easy to abuse the system using any kind of bots to
> trick us that e.g. 99% of sessions happen on KDE 1.0 or that given Linux
> distro has 90% of the global market :)

Vandalism is a potential problem indeed (did you actually have issues with 
that on Kexi btw? if so, what counter-measures did you apply?). However I 
don't see how a UUID is helping here, the bot could just as well generate 
UUIDs for each submission?
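
To illustrate the point about bots, here is a minimal Python sketch. It is not code from Kexi or KUserFeedback; the record layout (client ID, platform) and the function name are assumptions made up for this example. It counts each client-supplied random ID once and shows that a bot sending a fresh UUID with every submission is still counted as many distinct installations.

```python
import uuid
from collections import Counter

def platform_shares(submissions):
    """Count each client ID once and return the share of IDs per platform."""
    latest = {}
    for client_id, platform in submissions:
        latest[client_id] = platform  # last report per ID wins
    counts = Counter(latest.values())
    total = sum(counts.values())
    return {platform: n / total for platform, n in counts.items()}

# Ten honest installations, each reporting three times under a stable ID.
honest = []
for _ in range(10):
    cid = str(uuid.uuid4())
    honest += [(cid, "Plasma 5")] * 3

# One bot that invents a new random ID for each of its 90 submissions.
bot = [(str(uuid.uuid4()), "KDE 1.0") for _ in range(90)]

shares = platform_shares(honest + bot)
print(shares)
```

Deduplicating by ID removes the honest repeat reports, but it cannot tell the bot's invented IDs apart from real installations.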

> Similarly app projects may need the IDs to answer question about most and
> least used features. Most used as in "most users found it, understood it
> and use it", not "most usage reports has been delivered for it (maybe
> coming from a single user -- maybe even my very own co-developer). There
> are many other examples probably already discussed.

Sure this gets easier with unique ids, but it's not impossible without them. 
After all the goal here isn't to make our lives easier, but to agree on 
something that is acceptable for our users. And yes, that might imply more 
work and/or less accurate data.
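
The two readings of "most used" from the quoted paragraph can be put side by side in a short Python sketch. The feature names, user IDs and the function are hypothetical and only serve the illustration; nothing here is taken from an existing telemetry server.

```python
from collections import defaultdict

def usage_metrics(reports):
    """Return {feature: (report_count, distinct_user_count)}."""
    counts = defaultdict(int)
    users = defaultdict(set)
    for user_id, feature in reports:
        counts[feature] += 1
        users[feature].add(user_id)
    return {feature: (counts[feature], len(users[feature])) for feature in counts}

reports = (
    # One heavy user, e.g. a co-developer, using a feature fifty times.
    [("dev-1", "table_designer")] * 50
    # Twenty different users, each using another feature once.
    + [(f"user-{i}", "csv_import") for i in range(20)]
)

metrics = usage_metrics(reports)
print(metrics)
```

Ranked by report count, the first feature wins; ranked by distinct users, the second one does. Without any per-installation ID only the first number is directly available, which is the accuracy cost mentioned above.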

> Thus I would see the Anonymity is covered by KEXI's approach except that it
> offers opt-in tracking of unique user for unique installations. KEXI
> currently does not track unique installations at all until the user agrees
> for any telemetry (the KexiUserFeedbackAgent::AnonymousIdentificationArea
> value). This is required by nature of stats computed (and abuses mentioned
> above are the reason).
> 
> Is this a big deal? We're close to philosophy area here. 

Correct, this is about the philosophy behind our products :) And one very core 
part of that happens to be privacy.

That basically leaves the question: do we want to additionally allow the
opt-in use of unique identifiers?

> Before designing the stats engine I guessed: not more than installing an 
> email app or buying a SIM card and starting to use them; they allow me to 
> send email or make a call using protocols that disclose quite a bit about 
> me. 

Sure, but that is also where we can differentiate. Just because other 
applications weren't designed with privacy in mind doesn't mean we should 
follow their example IMHO.

Regards,
Volker

> I would respect
> users that disagree with that but th

Re: Telemetry Policy

2017-08-16 Thread Thomas Pfeiffer

On Mittwoch, 16. August 2017 09:33:02 CEST Valorie Zimmerman wrote:
> Hi all, Mozilla has done a lot of work on telemetry, and we might be
> able to use some of their findings. On this page:
> https://wiki.mozilla.org/Firefox/Data_Collection they break down the
> data they might possibly collect into four buckets - technical (such
> as crashes), user interaction, web activity, and sensitive (personal
> data).
> 
> This bit might be relevant to our discussion: "Categories 1 & 2
> (Technical & Interaction data)
> Pre-Release & Release: Data may default on, provided the data is
> exclusively in these categories (it cannot be in any other category).
> In Release, an opt-out must be available for most types of Technical
> and Interaction data. "
> 
> I think the entire page might be enlightening to this discussion. I
> believe our analysis of needs should be more fine-grained, and that
> some parts of what we need can be "default on" especially for
> pre-release testing. For releases, we can provide an opt-out.

Hi Valorie,
Even if opt-out for some data is legally and even morally fine, it does not 
align with the values we communicate to our users:
Unlike Mozilla's Mission, our Vision mentions privacy explicitly, and we're 
striving to make privacy our USP.

Therefore I agree with others who replied in this thread: We should err on the
side of respecting privacy too much rather than too little.

In the end, of course, it's a matter of how we present this opt-in. If it's an 
option buried in some settings dialog, we might as well not do it at all.

If we, however, like Firefox does, prominently present that choice to users
the first time they run one of our applications or desktop environment and try
to make clear why that data collection is important for us, I don't see why we
could not convince a relevant number of users to opt in.
Sure, we'll get less data than with an opt-out scheme, but let's try it out 
first before we go for the option that carries a significant PR risk.

> Other more sensitive data will need to be opt-in. I think it's a
> mistake to treat all the data we might want all in the same way.

Content (web activity for Mozilla) and personal information should not be
opt-anything, but simply not collected at all.

Cheers,
Thomas

Re: Telemetry Policy

2017-08-16 Thread Jaroslaw Staniek

On 16 August 2017 at 18:56, Volker Krause  wrote:

> On Wednesday, 16 August 2017 15:23:07 CEST Jaroslaw Staniek wrote:
> > On 16 August 2017 at 14:13, Volker Krause  wrote:
> > > On Wednesday, 16 August 2017 09:33:02 CEST Valorie Zimmerman wrote:
> > > > Hi all, Mozilla has done a lot of work on telemetry, and we might be
> > > > able to use some of their findings. On this page:
> > > > https://wiki.mozilla.org/Firefox/Data_Collection they break down the
> > > > data they might possibly collect into four buckets - technical (such
> > > > as crashes), user interaction, web activity, and sensitive (personal
> > > > data).
> > >
> > > without making it that explicit, we basically have the same four
> > > categories of data too, and explicitly exclude the use of category 3 and
> > > 4, ie user content/activity and personal data, only technical and
> > > interaction data are allowed to be used (category 1 and 2).
> > >
> > > > This bit might be relevant to our discussion: "Categories 1 & 2
> > > > (Technical & Interaction data)
> > > > Pre-Release & Release: Data may default on, provided the data is
> > > > exclusively in these categories (it cannot be in any other category).
> > > > In Release, an opt-out must be available for most types of Technical
> > > > and Interaction data. "
> > > >
> > > > I think the entire page might be enlightening to this discussion. I
> > > > believe our analysis of needs should be more fine-grained, and that
> > > > some parts of what we need can be "default on" especially for
> > > > pre-release testing. For releases, we can provide an opt-out.
> > > >
> > > > Other more sensitive data will need to be opt-in. I think it's a
> > > > mistake to treat all the data we might want all in the same way.
> > >
> > > This again brings up opt-out, which so far doesn't seem to have a chance
> > > for consensus. Can we defer this to when we have some more experience with
> > > the opt-in approach and how much participation we get with that? Or are
> > > people feeling this would too strongly limit what they are allowed to do in
> > > their applications?
> >
> > In addition maybe distributors can sometimes make the decision based on
> > opinions from given subprojects.
> > For example the option would be pre-set to ON in KEXI's installer for Mac
> > and Windows itself and for Linux AppImages, not in the source code.
> > Just saying, KEXI has not yet switched to the new framework :)
>
> The policy we are discussing here is (and is supposed to be) independent of
> the implementation. And that's not just theoretical, Kexi is one prominent
> case for an alternative implementation, and the Krita GSoC also seems to
> contain some alternative server code for example. So input in particular
> from those teams matters a lot for me, as this policy in its current form
> would affect them too.
>
> And a policy we only adhere in code and work around in the end by putting
> on a distributor hat (which we can in many places, as your examples show)
> isn't really helping, I'd much rather have it reflect what we actually do :)
>
> From having read the code of both, I think the only possible points of
> conflict with the policy draft might be:
> - opt-in
>

The source code is 100% opt-in (grep for
'areas(KexiUserFeedbackAgent::NoAreas)'). Anyone is free to change this
default and create a distribution under their own name, and I understand this
will not be a "distribution by KDE".
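
As a rough illustration of what "opt-in by default" means in code, here is a minimal Python sketch. It only mirrors the idea behind the NoAreas default mentioned above; the class names, area names and methods are hypothetical and are not the KexiUserFeedbackAgent API.

```python
from enum import Flag, auto

class Area(Flag):
    """Hypothetical telemetry areas; the names are illustrative only."""
    NONE = 0
    SYSTEM_INFO = auto()
    SCREEN_INFO = auto()
    ANONYMOUS_ID = auto()

class FeedbackSettings:
    def __init__(self, enabled_areas=Area.NONE):
        # Opt-in: nothing is enabled unless the user explicitly chose it.
        self.enabled_areas = enabled_areas

    def may_send(self, area):
        """Return True only if the user has enabled this area."""
        return bool(self.enabled_areas & area)

settings = FeedbackSettings()
before = settings.may_send(Area.SYSTEM_INFO)

# The user ticks one checkbox; all other areas stay off.
settings.enabled_areas |= Area.SYSTEM_INFO
after = settings.may_send(Area.SYSTEM_INFO)
print(before, after, settings.may_send(Area.ANONYMOUS_ID))
```

A distributor who wants different behaviour would have to change the default argument, which is exactly the kind of change the policy asks distributors not to make.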



> - hosted on KDE infrastructure
>

My assumption: as KEXI has an open-core+whatever-license-for-plugins
architecture, ultimately the telemetry information from KEXI users would be
better hosted by KDE. Any extra information retrieved by plugins (if that
even exists) can be hosted elsewhere, but this is the responsibility of the
plugin developers.



> - Kexi seems to (optionally?) contain a unique identifier
>

This is mostly related to cases when any kind of cloud storage is used.
These cases involve unique accounts already so users can be identified very
well even without having telemetry functionality.

KEXI installations limited to open-core, used away from a cloud, do not
need identifiers.
However I understand that identifiers independent of network or host ID
(basically randomly generated QUuids) are useful for even basic telemetry
needs. Without them it's easy to abuse the system, using any kind of bots to
trick us into believing that e.g. 99% of sessions happen on KDE 1.0 or that a
given Linux distro has 90% of the global market :)

Similarly, app projects may need the IDs to answer questions about the most
and least used features. Most used as in "most users found it, understood it
and use it", not "most usage reports have been delivered for it" (maybe
coming from a single user, maybe even my very own co-developer). There
are many other examples, probably already discussed.

Thus I would see the Anonymity is covered by KEXI's approach except that it
offers opt-in tracking of unique user for unique in

Re: Telemetry Policy

2017-08-16 Thread Volker Krause

On Wednesday, 16 August 2017 11:31:31 CEST Mirko Boehm - KDE wrote:
> before this gets completely out of hand: The cited German data protection
> regulations are often misunderstood, even by people that pose as experts.
> They are also often (mis-)used as killer arguments to support political or
> personal opinions. If we start collecting telemetry data, we should get an
> assessment by a lawyer (!) that the way we handle the data is correct.

Yep, let's bring this up at the e.V. level once we have agreement on what we
actually want to do in detail; there are still some very relevant open
questions here (such as the publication/licensing of the data).

> However, it can certainly be done correctly and in a way that protects
> individual privacy and supports the improvement of our software.
> 
> Technical argument: If IP addresses are a concern, would it be an option to
> run them through a one-way hash function on the client side before
> submitting the data?

It's not about IP addresses we submit (we shouldn't be doing that in the first
place), but about the one the web server inevitably sees when receiving data.
We can't avoid that; we just need to make sure it is kept separate from the
telemetry data, so that the telemetry isn't "tainted" as personal data.
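
A minimal sketch of that separation, in Python, might look as follows. The field names and the function are assumptions for illustration, not the actual server code: the address goes into an access-log entry that carries no payload and no record ID, and the telemetry record is stored without any address.

```python
import json

def split_request(remote_addr, body):
    """Return (access_log_entry, telemetry_record) with no shared key."""
    record = json.loads(body)
    # Drop address-like fields a client might have included by mistake.
    for key in ("ip", "remote_addr", "hostname"):
        record.pop(key, None)
    # The access log keeps the address but nothing that links to the record.
    access_entry = {"remote_addr": remote_addr}
    return access_entry, record

access, record = split_request(
    "192.0.2.10",
    '{"version": "3.1", "os": "linux", "ip": "192.0.2.10"}',
)
print(access)
print(record)
```

Note that precise timestamps stored on both sides could still allow the two to be correlated, so a real deployment would have to consider that as well.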

Regards,
Volker


> On Wed, Aug 16, 2017 at 11:08 AM Volker Krause wrote:
> On Wednesday, 16 August 2017 10:21:11 CEST Ben Cooksley wrote:
> > On Mon, Aug 14, 2017 at 11:40 PM, Volker Krause wrote:
> > > I agree on the proposed wording changes, so focusing on your technical
> > > points below.
> > > 
> > > On Monday, 14 August 2017 11:53:17 CEST Ben Cooksley wrote:
> > >> I've got two technical notes here:
> > >> 
> > >> 1) All products should fetch details on where to submit telemetry data
> > >> from an online configuration file similar to
> > >> https://autoconfig.kde.org/ocs/providers.xml
> > >> 
> > >> 
> > >> This would give us the capacity to version the telemetry server api,
> > >> and potentially even "kill" telemetry submissions from older
> > >> application versions if needed.
> > >> 
> > >> 2) No software product should use the QNetworkAccessManager family of
> > >> classes due to known defects in its operation within some versions of
> > >> Qt which cause infrastructure problems.
> > > 
> > > The current implementation uses QNAM, but actually has code to handle
> > > HTTP
> > > redirects correctly (with unit test coverage), I assume that's the issue
> > > you are referring to? This also has been tested all the way back to
> > > Qt4.8
> > > as part of the existing deployment in GammaRay.
> > 
> > That's one of the considerations yes. I'm hopeful that nothing else in
> > it will be found to be broken behaviour wise but have much more faith
> > in KIO here.
> > 
> > > I don't mind adding the extra indirection with the configuration file,
> > > although just from the XML I don't see yet what that would provide
> > > beyond
> > > HTTP redirects. Are there certain information (e.g. the app version)
> > > passed already as part of the request for the configuration file? Or can
> > > there be conditional aspects not currently present in the above example?
> > 
> > The extra indirection is basically to give us the option to shift the
> > endpoint elsewhere at some point without having to keep the old one
> > alive even as a redirect.
> 
> Isn't that just shifting the requirement for the "stable" endpoint to the
> configuration one? But if that's easier we can of course add that. Are there
> any formats/standards you have in mind for this, or any parameters the GET
> request should contain?
> 
> > I'm also concerned that we could potentially run into issues if the
> > system doesn't do any GET requests. From what I recall unless the
> > server and client support a specific RFC then redirecting POST
> > requests isn't something one can rely on here (your code might handle
> > this properly, I certainly wouldn't trust QNAM to do so given their
> > stance on optional behaviour in HTTP RFCs)
> 
> Correct, QNAM doesn't support POST redirects itself. But since we deal with
> redirects ourselves anyway, that's not really an issue. On the server I
> haven't run into issues yet, even the super primitive HTTP test server built
> into PHP can handle it. POST redirects aren't particularly elegant though,
> as you are sending the payload multiple times. So the extra GET might be a
> better solution anyway.
> 
> Regards,
> Volker
> 
> 
> --
> Mirko Boehm | mi...@kde.org | KDE e.V.
> FSFE Fellowship Representative, FSFE Team Germany
> Qt Certified Specialist and Trainer
> Request a meeting: https://doodle.com/mirkoboehm
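
The configuration-file indirection discussed in the quoted exchange above could be sketched as below. This is only an illustration under stated assumptions: the thread refers to an XML file, while this sketch uses a made-up JSON layout, and the keys, URL and function name are hypothetical. The client fetches the configuration with a plain GET, looks up the entry for its own API version, and submits nothing if that version is unknown or has been disabled.

```python
import json

def resolve_endpoint(config_text, client_api_version):
    """Return the submission URL for this client, or None if it must not submit."""
    config = json.loads(config_text)
    entry = config.get("endpoints", {}).get(str(client_api_version))
    if entry is None or entry.get("disabled", False):
        return None  # unknown or "killed" API version: submit nothing
    return entry["url"]

# Hypothetical configuration as it might be served from a stable location.
CONFIG = json.dumps({
    "endpoints": {
        "1": {"disabled": True},
        "2": {"url": "https://telemetry.example.org/v2/submit"},
    }
})

print(resolve_endpoint(CONFIG, 1))
print(resolve_endpoint(CONFIG, 2))
print(resolve_endpoint(CONFIG, 3))
```

Because the lookup is a GET and the POST then goes straight to the resolved URL, the payload is sent only once and no POST redirect handling is needed on this path.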




Re: Telemetry Policy

2017-08-16 Thread Boudewijn Rempt

On Wed, 16 Aug 2017, Volker Krause wrote:

> The policy we are discussing here is (and is supposed to be) independent of 
> the implementation. And that's not just theoretical, Kexi is one prominent 
> case for an alternative implementation, and the Krita GSoC also seems to 
> contain some alternative server code for example. 

That, btw, is not what _I_ as Krita maintainer want, and before going live,
it probably needs changing.

-- 
Boudewijn Rempt | http://www.krita.org, http://www.valdyas.org

Re: Telemetry Policy

2017-08-16 Thread Volker Krause

On Wednesday, 16 August 2017 15:23:07 CEST Jaroslaw Staniek wrote:
> On 16 August 2017 at 14:13, Volker Krause  wrote:
> > On Wednesday, 16 August 2017 09:33:02 CEST Valorie Zimmerman wrote:
> > > Hi all, Mozilla has done a lot of work on telemetry, and we might be
> > > able to use some of their findings. On this page:
> > > https://wiki.mozilla.org/Firefox/Data_Collection they break down the
> > > data they might possibly collect into four buckets - technical (such
> > > as crashes), user interaction, web activity, and sensitive (personal
> > > data).
> > 
> > without making it that explicit, we basically have the same four
> > categories of data too, and explicitly exclude the use of category 3 and
> > 4, ie user content/activity and personal data, only technical and
> > interaction data are allowed to be used (category 1 and 2).
> > 
> > > This bit might be relevant to our discussion: "Categories 1 & 2
> > > (Technical & Interaction data)
> > > Pre-Release & Release: Data may default on, provided the data is
> > > exclusively in these categories (it cannot be in any other category).
> > > In Release, an opt-out must be available for most types of Technical
> > > and Interaction data. "
> > > 
> > > I think the entire page might be enlightening to this discussion. I
> > > believe our analysis of needs should be more fine-grained, and that
> > > some parts of what we need can be "default on" especially for
> > > pre-release testing. For releases, we can provide an opt-out.
> > > 
> > > Other more sensitive data will need to be opt-in. I think it's a
> > > mistake to treat all the data we might want all in the same way.
> > 
> > This again brings up opt-out, which so far doesn't seem to have a chance
> > for consensus. Can we defer this to when we have some more experience with
> > the opt-in approach and how much participation we get with that? Or are
> > people feeling this would too strongly limit what they are allowed to do in
> > their applications?
> 
> In addition maybe distributors can sometimes make the decision based on
> opinions from given subprojects.
> For example the option would be pre-set to ON in KEXI's installer for Mac
> and Windows itself and for Linux AppImages, not in the source code.
> Just saying, KEXI has not yet switched to the new framework :)

The policy we are discussing here is (and is supposed to be) independent of 
the implementation. And that's not just theoretical, Kexi is one prominent 
case for an alternative implementation, and the Krita GSoC also seems to 
contain some alternative server code for example. So input in particular from 
those teams matters a lot for me, as this policy in its current form would 
affect them too.

And a policy we only adhere to in code and work around in the end by putting on
a distributor hat (which we can in many places, as your examples show) isn't
really helping; I'd much rather have it reflect what we actually do :)

From having read the code of both, I think the only possible points of 
conflict with the policy draft might be:
- opt-in
- hosted on KDE infrastructure
- Kexi seems to (optionally?) contain a unique identifier

Regards,
Volker

> > Seeing yesterday's blog from the Krita team (https://akapust1n.github.io/
> > 2017-08-15-sixth-blog-gsoc-2017/), I'd particularly be interested in their
> > view on this.
> > 
> > Regards,
> > Volker
> > 
> > > On Sun, Aug 13, 2017 at 3:18 AM, Christian Loosli wrote:
> > > > Hi,
> > > > 
> > > > thank you very much for this work, sounds great!
> > > > 
> > > > Only point I have: maybe make sure that the opt-in / default settings
> > > > are not only mandatory for application developers, but also for
> > > > packagers / distributions.
> > > > 
> > > > Some distributions have rather questionable views on privacy and by
> > > > default sent information to third parties, so I would feel much more
> > > > safe if they weren't allowed (in theory) to flick the switch in their
> > > > package by default to "on" either.
> > > > 
> > > > Kind regards,
> > > > 
> > > > Christian



Re: Telemetry Policy

2017-08-16 Thread Boudewijn Rempt

On Wed, 16 Aug 2017, Volker Krause wrote:

> Seeing yesterday's blog from the Krita team (https://akapust1n.github.io/
> 2017-08-15-sixth-blog-gsoc-2017/), I'd particularly be interested in their 
> view on this.

I've pointed alexey at this thread, but there's a huge language barrier:
he basically communicates through google translate. I've made it a firm
condition of merging and operating the telemetry that we adhere to the
KDE policy in every way. Even then, I still consider his work to be
an experimental research project.

-- 
Boudewijn Rempt | http://www.krita.org, http://www.valdyas.org

Re: Telemetry Policy

2017-08-16 Thread Jaroslaw Staniek

On 16 August 2017 at 14:13, Volker Krause  wrote:

> On Wednesday, 16 August 2017 09:33:02 CEST Valorie Zimmerman wrote:
> > Hi all, Mozilla has done a lot of work on telemetry, and we might be
> > able to use some of their findings. On this page:
> > https://wiki.mozilla.org/Firefox/Data_Collection they break down the
> > data they might possibly collect into four buckets - technical (such
> > as crashes), user interaction, web activity, and sensitive (personal
> > data).
>
> without making it that explicit, we basically have the same four
> categories of data too, and explicitly exclude the use of category 3 and
> 4, ie user content/activity and personal data, only technical and
> interaction data are allowed to be used (category 1 and 2).
>
> > This bit might be relevant to our discussion: "Categories 1 & 2
> > (Technical & Interaction data)
> > Pre-Release & Release: Data may default on, provided the data is
> > exclusively in these categories (it cannot be in any other category).
> > In Release, an opt-out must be available for most types of Technical
> > and Interaction data. "
> >
> > I think the entire page might be enlightening to this discussion. I
> > believe our analysis of needs should be more fine-grained, and that
> > some parts of what we need can be "default on" especially for
> > pre-release testing. For releases, we can provide an opt-out.
> >
> > Other more sensitive data will need to be opt-in. I think it's a
> > mistake to treat all the data we might want all in the same way.
>
> This again brings up opt-out, which so far doesn't seem to have a chance
> for consensus. Can we defer this to when we have some more experience with
> the opt-in approach and how much participation we get with that? Or are
> people feeling this would too strongly limit what they are allowed to do in
> their applications?
>

In addition, maybe distributors can sometimes make the decision based on
opinions from given subprojects.
For example, the option would be pre-set to ON in KEXI's installer for Mac
and Windows itself and for Linux AppImages, not in the source code.
Just saying, KEXI has not yet switched to the new framework :)


> Seeing yesterday's blog from the Krita team (https://akapust1n.github.io/
> 2017-08-15-sixth-blog-gsoc-2017/), I'd particularly be interested in their
> view on this.
>
> Regards,
> Volker
>
> > On Sun, Aug 13, 2017 at 3:18 AM, Christian Loosli wrote:
> > > Hi,
> > >
> > > thank you very much for this work, sounds great!
> > >
> > > Only point I have: maybe make sure that the opt-in / default settings
> > > are not only mandatory for application developers, but also for
> > > packagers / distributions.
> > >
> > > Some distributions have rather questionable views on privacy and by
> > > default sent information to third parties, so I would feel much more
> > > safe if they weren't allowed (in theory) to flick the switch in their
> > > package by default to "on" either.
> > >
> > > Kind regards,
> > >
> > > Christian
>
>
>


-- 
regards, Jaroslaw Staniek

KDE:
: A world-wide network of software engineers, artists, writers, translators
: and facilitators committed to Free Software development - http://kde.org
Calligra Suite:
: A graphic art and office suite - http://calligra.org
Kexi:
: A visual database apps builder - http://calligra.org/kexi
Qt Certified Specialist:
: http://www.linkedin.com/in/jstaniek

Re: Telemetry Policy

2017-08-16 Thread Christian Loosli

Am Mittwoch, 16. August 2017, 11:46:12 CEST schrieb Volker Krause:

> Valid point, I've added a statement to the policy asking distributors of our
> products to respect the rules too.

Thank you very much :) 

> I don't think we can make this a hard requirement (as in: you lose the right
> to distribute our software), in my understanding that would be conflicting
> with the freedom guaranteed by the GPL.

You indeed can't. And whilst I personally think that the GPL is a not so great 
license, better ones don't accept that either. And GPL is set, so that's 
hypothetical anyway. 

That's perfectly fine though: fortunately there are distributions to choose
from, so if one decides it would like to override privacy setting
recommendations from upstream, I think users concerned with that kind of thing
can and simply will switch distributions.

> Regards,
> Volker

Thanks for your work, kind regards, 

Christian

Re: Telemetry Policy

2017-08-16 Thread Christian Loosli

Am Mittwoch, 16. August 2017, 00:33:02 CEST schrieb Valorie Zimmerman:

> I think the entire page might be enlightening to this discussion. I
> believe our analysis of needs should be more fine-grained, and that
> some parts of what we need can be "default on" especially for
> pre-release testing. For releases, we can provide an opt-out.

I'm afraid that at the very moment KDE starts transmitting my data, no matter 
what data, as opt-out, I'll opt out of supporting and using KDE products, and 
I assume a lot of people will do the same. 

This is, in my opinion, the exact opposite of our very principles and
manifesto, and I would not jeopardize our reputation just to gather some data,
to be honest. It would create reputational damage that is hard to fix (people
still remember the Unity Amazon thing).

I do admit that I have very strong views when it comes to privacy and data 
usage though. 

> Other more sensitive data will need to be opt-in. I think it's a
> mistake to treat all the data we might want all in the same way.
> 
> Valorie

Kind regards, 

Christian

> On Sun, Aug 13, 2017 at 3:18 AM, Christian Loosli wrote:
> > Hi,
> > 
> > thank you very much for this work, sounds great!
> > 
> > Only point I have: maybe make sure that the opt-in / default settings are
> > not only mandatory for application developers, but also for packagers /
> > distributions.
> > 
> > Some distributions have rather questionable views on privacy and by
> > default
> > sent information to third parties, so I would feel much more safe if they
> > weren't allowed (in theory) to flick the switch in their package by
> > default to "on" either.
> > 
> > Kind regards,
> > 
> > Christian

Re: Telemetry Policy

2017-08-16 Thread Volker Krause

On Wednesday, 16 August 2017 09:33:02 CEST Valorie Zimmerman wrote:
> Hi all, Mozilla has done a lot of work on telemetry, and we might be
> able to use some of their findings. On this page:
> https://wiki.mozilla.org/Firefox/Data_Collection they break down the
> data they might possibly collect into four buckets - technical (such
> as crashes), user interaction, web activity, and sensitive (personal
> data).

Without making it that explicit, we basically have the same four categories of
data too, and explicitly exclude the use of categories 3 and 4, i.e. user
content/activity and personal data; only technical and interaction data
(categories 1 and 2) are allowed to be used.
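
The category rule above can be expressed as a simple filter. The following Python sketch is only an illustration; the category names follow the four buckets quoted from the Mozilla page, while the field names and the function are hypothetical.

```python
from enum import Enum

class Category(Enum):
    TECHNICAL = 1
    INTERACTION = 2
    USER_CONTENT = 3   # "web activity" in Mozilla's list
    PERSONAL = 4

# Only categories 1 and 2 may ever be collected.
ALLOWED = {Category.TECHNICAL, Category.INTERACTION}

def filter_fields(fields):
    """Keep only the fields whose category the policy allows."""
    return {
        name: value
        for name, (category, value) in fields.items()
        if category in ALLOWED
    }

sample = {
    "qt_version": (Category.TECHNICAL, "5.9.1"),
    "start_count": (Category.INTERACTION, 42),
    "open_file_name": (Category.USER_CONTENT, "taxes.kexi"),
    "email": (Category.PERSONAL, "user@example.org"),
}

filtered = filter_fields(sample)
print(filtered)
```

Tagging every field with its category at the point where it is defined makes it harder for category 3 or 4 data to slip into a submission unnoticed.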

> This bit might be relevant to our discussion: "Categories 1 & 2
> (Technical & Interaction data)
> Pre-Release & Release: Data may default on, provided the data is
> exclusively in these categories (it cannot be in any other category).
> In Release, an opt-out must be available for most types of Technical
> and Interaction data. "
> 
> I think the entire page might be enlightening to this discussion. I
> believe our analysis of needs should be more fine-grained, and that
> some parts of what we need can be "default on" especially for
> pre-release testing. For releases, we can provide an opt-out.
> 
> Other more sensitive data will need to be opt-in. I think it's a
> mistake to treat all the data we might want all in the same way.

This again brings up opt-out, which so far doesn't seem to have a chance for 
consensus. Can we defer this to when we have some more experience with the 
opt-in approach and how much participation we get with that? Or are people 
feeling this would too strongly limit what they are allowed to do in their 
applications? 

Seeing yesterday's blog from the Krita team
(https://akapust1n.github.io/2017-08-15-sixth-blog-gsoc-2017/), I'd
particularly be interested in their view on this.

Regards,
Volker

> On Sun, Aug 13, 2017 at 3:18 AM, Christian Loosli wrote:
> > Hi,
> > 
> > thank you very much for this work, sounds great!
> > 
> > Only point I have: maybe make sure that the opt-in / default settings are
> > not only mandatory for application developers, but also for packagers /
> > distributions.
> > 
> > Some distributions have rather questionable views on privacy and by
> > default send information to third parties, so I would feel much more
> > safe if they
> > weren't allowed (in theory) to flick the switch in their package by
> > default to "on" either.
> > 
> > Kind regards,
> > 
> > Christian


Re: Telemetry Policy

2017-08-16 Thread Ben Cooksley

On Wed, Aug 16, 2017 at 9:06 PM, Volker Krause wrote:
> On Wednesday, 16 August 2017 10:21:11 CEST Ben Cooksley wrote:
>> On Mon, Aug 14, 2017 at 11:40 PM, Volker Krause wrote:
>> > I agree on the proposed wording changes, so focusing on your technical
>> > points below.
>> >
>> > On Monday, 14 August 2017 11:53:17 CEST Ben Cooksley wrote:
>> >> I've got two technical notes here:
>> >>
>> >> 1) All products should fetch details on where to submit telemetry data
>> >> from an online configuration file similar to
>> >> https://autoconfig.kde.org/ocs/providers.xml
>> >>
>> >> This would give us the capacity to version the telemetry server api,
>> >> and potentially even "kill" telemetry submissions from older
>> >> application versions if needed.
>> >>
>> >> 2) No software product should use the QNetworkAccessManager family of
>> >> classes due to known defects in its operation within some versions of
>> >> Qt which cause infrastructure problems.
>> >
>> > The current implementation uses QNAM, but actually has code to handle HTTP
>> > redirects correctly (with unit test coverage), I assume that's the issue
>> > you are referring to? This also has been tested all the way back to Qt4.8
>> > as part of the existing deployment in GammaRay.
>>
>> That's one of the considerations yes. I'm hopeful that nothing else in
>> it will be found to be broken behaviour wise but have much more faith
>> in KIO here.
>>
>> > I don't mind adding the extra indirection with the configuration file,
>> > although just from the XML I don't see yet what that would provide beyond
>> > HTTP redirects. Are there certain information (e.g. the app version)
>> > passed already as part of the request for the configuration file? Or can
>> > there be conditional aspects not currently present in the above example?
>>
>> The extra indirection is basically to give us the option to shift the
>> endpoint elsewhere at some point without having to keep the old one
>> alive even as a redirect.
>
> Isn't that just shifting the requirement for the "stable" endpoint to the
> configuration one? But if that's easier we can of course add that. Are there
> any formats/standards you have in mind for this, or any parameters the GET
> request should contain?

Yes it does, but it transfers control over where the data is
ultimately submitted from something which gets hardcoded in
applications to something which is more under our control.

In terms of parameters I was thinking that the file should be totally
static - the format is up to you to define (that XML was just an
example of something we already have).
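
The indirection could look roughly like the following Python sketch. The JSON
layout, the field names (endpoints, apiVersion, url) and the example.org URL
are assumptions for illustration only; the actual format was left open above.
Fetching the file is omitted, the function only interprets its text.

```python
import json

# Hypothetical: the server API version this client build knows how to speak.
SUPPORTED_API_VERSION = 1


def resolve_endpoint(config_text, api_version=SUPPORTED_API_VERSION):
    """Return the submission URL for this client, or None if it must not submit.

    config_text is the content of a static configuration file, for example:
    {"endpoints": [{"apiVersion": 1,
                    "url": "https://telemetry.example.org/v1/submit"}]}
    """
    try:
        config = json.loads(config_text)
    except ValueError:
        return None  # unreadable configuration: do not submit anything
    if not isinstance(config, dict):
        return None
    endpoints = config.get("endpoints")
    if not isinstance(endpoints, list):
        return None
    for entry in endpoints:
        if isinstance(entry, dict) and entry.get("apiVersion") == api_version:
            return entry.get("url")
    # No entry for our API version: submissions from this client are "killed".
    return None
```

Removing the entry for an old API version from the static file is then enough
to stop submissions from clients that only speak that version.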

>
>> I'm also concerned that we could potentially run into issues if the
>> system doesn't do any GET requests. From what I recall unless the
>> server and client support a specific RFC then redirecting POST
>> requests isn't something one can rely on here (your code might handle
>> this properly, I certainly wouldn't trust QNAM to do so given their
>> stance on optional behaviour in HTTP RFCs)
>
> Correct, QNAM doesn't support POST redirects itself. But since we deal with
> redirects ourselves anyway, that's not really an issue. On the server I
> haven't run into issues yet, even the super primitive HTTP test server built
> into PHP can handle it. POST redirects aren't particularly elegant though, as
> you are sending the payload multiple times. So the extra GET might be a better
> solution anyway.

*nod*

>
> Regards,
> Volker

Thanks,
Ben

Re: Telemetry Policy

2017-08-16 Thread Volker Krause

On Sunday, 13 August 2017 12:18:16 CEST Christian Loosli wrote:
> Hi,
> 
> thank you very much for this work, sounds great!
> 
> Only point I have: maybe make sure that the opt-in / default settings are
> not only mandatory for application developers, but also for packagers /
> distributions.
> 
> Some distributions have rather questionable views on privacy and by default
> send information to third parties, so I would feel much more safe if they
> weren't allowed (in theory) to flick the switch in their package by default
> to "on" either.

Valid point, I've added a statement to the policy asking distributors of our 
products to respect the rules too.

I don't think we can make this a hard requirement (as in: you lose the right 
to distribute our software), in my understanding that would be conflicting with 
the freedom guaranteed by the GPL.

Regards,
Volker


Re: Telemetry Policy

2017-08-16 Thread Mirko Boehm - KDE

Hi,

Before this gets completely out of hand: The cited German data protection
regulations are often misunderstood, even by people that pose as experts. They 
are also often (mis-)used as killer arguments to support political or personal 
opinions. If we start collecting telemetry data, we should get an assessment by 
a lawyer (!) that the way we handle the data is correct. However, it can 
certainly be done correctly and in a way that protects individual privacy and 
supports the improvement of our software.

Technical argument: If IP addresses are a concern, would it be an option to run 
them through a one-way hash function on the client side before submitting the 
data?
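
A one-way hash as suggested here could be sketched in Python like this. It is
an illustration only: the function name and the use of a keyed hash
(HMAC-SHA256) are assumptions of this sketch, and where the key would live is
left open. A keyed hash is used because the IPv4 address space is small enough
that an unkeyed hash can be reversed by trying all addresses.

```python
import hashlib
import hmac


def hash_address(address, key):
    """Return a hex digest of the address that cannot be inverted without key.

    address is the textual IP address, key is a secret byte string
    (hypothetical: how it is generated and stored is not covered here).
    """
    return hmac.new(key, address.encode("utf-8"), hashlib.sha256).hexdigest()
```

The same address and key always give the same digest, so repeated submissions
could still be recognised for abuse handling without storing the address.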

Best,

Mirko.

On Wed, Aug 16, 2017 at 11:08 AM Volker Krause wrote:
On Wednesday, 16 August 2017 10:21:11 CEST Ben Cooksley wrote:
> On Mon, Aug 14, 2017 at 11:40 PM, Volker Krause wrote:
> > I agree on the proposed wording changes, so focusing on your technical
> > points below.
> >
> > On Monday, 14 August 2017 11:53:17 CEST Ben Cooksley wrote:
> >> I've got two technical notes here:
> >>
> >> 1) All products should fetch details on where to submit telemetry data
> >> from an online configuration file similar to
> >> https://autoconfig.kde.org/ocs/providers.xml
> >>
> >> This would give us the capacity to version the telemetry server api,
> >> and potentially even "kill" telemetry submissions from older
> >> application versions if needed.
> >>
> >> 2) No software product should use the QNetworkAccessManager family of
> >> classes due to known defects in its operation within some versions of
> >> Qt which cause infrastructure problems.
> >
> > The current implementation uses QNAM, but actually has code to handle HTTP
> > redirects correctly (with unit test coverage), I assume that's the issue
> > you are referring to? This also has been tested all the way back to Qt4.8
> > as part of the existing deployment in GammaRay.
>
> That's one of the considerations yes. I'm hopeful that nothing else in
> it will be found to be broken behaviour wise but have much more faith
> in KIO here.
>
> > I don't mind adding the extra indirection with the configuration file,
> > although just from the XML I don't see yet what that would provide beyond
> > HTTP redirects. Are there certain information (e.g. the app version)
> > passed already as part of the request for the configuration file? Or can
> > there be conditional aspects not currently present in the above example?
>
> The extra indirection is basically to give us the option to shift the
> endpoint elsewhere at some point without having to keep the old one
> alive even as a redirect.

Isn't that just shifting the requirement for the "stable" endpoint to the
configuration one? But if that's easier we can of course add that. Are there
any formats/standards you have in mind for this, or any parameters the GET
request should contain?

> I'm also concerned that we could potentially run into issues if the
> system doesn't do any GET requests. From what I recall unless the
> server and client support a specific RFC then redirecting POST
> requests isn't something one can rely on here (your code might handle
> this properly, I certainly wouldn't trust QNAM to do so given their
> stance on optional behaviour in HTTP RFCs)

Correct, QNAM doesn't support POST redirects itself. But since we deal with
redirects ourselves anyway, that's not really an issue. On the server I
haven't run into issues yet, even the super primitive HTTP test server built
into PHP can handle it. POST redirects aren't particularly elegant though, as
you are sending the payload multiple times. So the extra GET might be a better
solution anyway.

Regards,
Volker

--
Mirko Boehm | mi...@kde.org | KDE e.V.
FSFE Fellowship Representative, FSFE Team Germany
Qt Certified Specialist and Trainer
Request a meeting: https://doodle.com/mirkoboehm


Re: Telemetry Policy

2017-08-16 Thread Volker Krause

On Wednesday, 16 August 2017 10:21:11 CEST Ben Cooksley wrote:
> On Mon, Aug 14, 2017 at 11:40 PM, Volker Krause wrote:
> > I agree on the proposed wording changes, so focusing on your technical
> > points below.
> > 
> > On Monday, 14 August 2017 11:53:17 CEST Ben Cooksley wrote:
> >> I've got two technical notes here:
> >> 
> >> 1) All products should fetch details on where to submit telemetry data
> >> from an online configuration file similar to
> >> https://autoconfig.kde.org/ocs/providers.xml
> >> 
> >> This would give us the capacity to version the telemetry server api,
> >> and potentially even "kill" telemetry submissions from older
> >> application versions if needed.
> >> 
> >> 2) No software product should use the QNetworkAccessManager family of
> >> classes due to known defects in its operation within some versions of
> >> Qt which cause infrastructure problems.
> > 
> > The current implementation uses QNAM, but actually has code to handle HTTP
> > redirects correctly (with unit test coverage), I assume that's the issue
> > you are referring to? This also has been tested all the way back to Qt4.8
> > as part of the existing deployment in GammaRay.
> 
> That's one of the considerations yes. I'm hopeful that nothing else in
> it will be found to be broken behaviour wise but have much more faith
> in KIO here.
> 
> > I don't mind adding the extra indirection with the configuration file,
> > although just from the XML I don't see yet what that would provide beyond
> > HTTP redirects. Are there certain information (e.g. the app version)
> > passed already as part of the request for the configuration file? Or can
> > there be conditional aspects not currently present in the above example?
> 
> The extra indirection is basically to give us the option to shift the
> endpoint elsewhere at some point without having to keep the old one
> alive even as a redirect.

Isn't that just shifting the requirement for the "stable" endpoint to the 
configuration one? But if that's easier we can of course add that. Are there 
any formats/standards you have in mind for this, or any parameters the GET 
request should contain?

> I'm also concerned that we could potentially run into issues if the
> system doesn't do any GET requests. From what I recall unless the
> server and client support a specific RFC then redirecting POST
> requests isn't something one can rely on here (your code might handle
> this properly, I certainly wouldn't trust QNAM to do so given their
> stance on optional behaviour in HTTP RFCs)

Correct, QNAM doesn't support POST redirects itself. But since we deal with 
redirects ourselves anyway, that's not really an issue. On the server I 
haven't run into issues yet, even the super primitive HTTP test server built 
into PHP can handle it. POST redirects aren't particularly elegant though, as 
you are sending the payload multiple times. So the extra GET might be a better 
solution anyway.
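
Handling the redirect in application code, as described above, might look like
this Python sketch. It is not the KUserFeedback implementation: the send
callback, its return shape and the restriction to status codes 307 and 308
(the ones that preserve the request method) are assumptions made for the
example.

```python
from urllib.parse import urljoin

# Redirect status codes that ask the client to repeat the same method and body.
METHOD_PRESERVING_REDIRECTS = {307, 308}


def post_following_redirects(send, url, payload, max_hops=5):
    """POST payload to url, re-sending it when the server redirects.

    send(url, payload) is a caller-supplied function (hypothetical) that
    performs one POST and returns (status_code, headers_dict), with the
    redirect target under the "Location" key.
    Returns (final_status, final_url).
    """
    for _ in range(max_hops + 1):
        status, headers = send(url, payload)
        location = headers.get("Location")
        if status in METHOD_PRESERVING_REDIRECTS and location:
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, location)
            continue
        return status, url
    raise RuntimeError("too many redirects")
```

Note that the payload is transmitted once per hop, which is the inelegance
mentioned above and the argument for resolving the endpoint with a GET first.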

Regards,
Volker


Re: Telemetry Policy

2017-08-16 Thread Ben Cooksley

On Mon, Aug 14, 2017 at 11:40 PM, Volker Krause wrote:
> I agree on the proposed wording changes, so focusing on your technical points
> below.
>
> On Monday, 14 August 2017 11:53:17 CEST Ben Cooksley wrote:
>> I've got two technical notes here:
>>
>> 1) All products should fetch details on where to submit telemetry data
>> from an online configuration file similar to
>> https://autoconfig.kde.org/ocs/providers.xml
>>
>> This would give us the capacity to version the telemetry server api,
>> and potentially even "kill" telemetry submissions from older
>> application versions if needed.
>>
>> 2) No software product should use the QNetworkAccessManager family of
>> classes due to known defects in its operation within some versions of
>> Qt which cause infrastructure problems.
>
> The current implementation uses QNAM, but actually has code to handle HTTP
> redirects correctly (with unit test coverage), I assume that's the issue you
> are referring to? This also has been tested all the way back to Qt4.8 as part
> of the existing deployment in GammaRay.

That's one of the considerations yes. I'm hopeful that nothing else in
it will be found to be broken behaviour wise but have much more faith
in KIO here.

>
> I don't mind adding the extra indirection with the configuration file,
> although just from the XML I don't see yet what that would provide beyond
> HTTP redirects. Are there certain information (e.g. the app version) passed
> already as part of the request for the configuration file? Or can there be
> conditional aspects not currently present in the above example?

The extra indirection is basically to give us the option to shift the
endpoint elsewhere at some point without having to keep the old one
alive even as a redirect.

I'm also concerned that we could potentially run into issues if the
system doesn't do any GET requests. From what I recall unless the
server and client support a specific RFC then redirecting POST
requests isn't something one can rely on here (your code might handle
this properly, I certainly wouldn't trust QNAM to do so given their
stance on optional behaviour in HTTP RFCs)

>
> Regards,
> Volker

Thanks,
Ben

Re: Telemetry Policy

2017-08-16 Thread Valorie Zimmerman

Hi all, Mozilla has done a lot of work on telemetry, and we might be
able to use some of their findings. On this page:
https://wiki.mozilla.org/Firefox/Data_Collection they break down the
data they might possibly collect into four buckets - technical (such
as crashes), user interaction, web activity, and sensitive (personal
data).

This bit might be relevant to our discussion: "Categories 1 & 2
(Technical & Interaction data)
Pre-Release & Release: Data may default on, provided the data is
exclusively in these categories (it cannot be in any other category).
In Release, an opt-out must be available for most types of Technical
and Interaction data. "

I think the entire page might be enlightening to this discussion. I
believe our analysis of needs should be more fine-grained, and that
some parts of what we need can be "default on" especially for
pre-release testing. For releases, we can provide an opt-out.

Other more sensitive data will need to be opt-in. I think it's a
mistake to treat all the data we might want all in the same way.

Valorie

On Sun, Aug 13, 2017 at 3:18 AM, Christian Loosli wrote:
> Hi,
>
> thank you very much for this work, sounds great!
>
> Only point I have: maybe make sure that the opt-in / default settings are not
> only mandatory for application developers, but also for packagers /
> distributions.
>
> Some distributions have rather questionable views on privacy and by default
> send information to third parties, so I would feel much more safe if they
> weren't allowed (in theory) to flick the switch in their package by default to
> "on" either.
>
> Kind regards,
>
> Christian

-- 
http://about.me/valoriez

Re: Telemetry Policy

2017-08-16 Thread Christian Loosli

Hi, 

thank you very much for this work, sounds great! 

Only point I have: maybe make sure that the opt-in / default settings are not 
only mandatory for application developers, but also for packagers / 
distributions. 

Some distributions have rather questionable views on privacy and by default 
send information to third parties, so I would feel much more safe if they
weren't allowed (in theory) to flick the switch in their package by default to 
"on" either.

Kind regards, 

Christian

Re: Telemetry Policy

2017-08-15 Thread Volker Krause

On Monday, 14 August 2017 22:26:36 CEST Thomas Pfeiffer wrote:
> On Sonntag, 13. August 2017 11:47:28 CEST Volker Krause wrote:
> > ## Minimalism
> > 
> > We only track the bare minimum of data necessary to answer specific
> > questions, we do not collect data preemptively or for exploratory
> > research.
> > In particular, this means:
> > - collected data must have a clear purpose
> 
> While from a privacy perspective this certainly makes sense, with my user
> researcher hat on I'm worried that this might severely limit the usefulness
> of the whole operation, at least if changes to what is being tracked can
> only be made with each new major release of an application.
> 
> Psychologists usually collect more information in their studies than they
> would need strictly to test their hypotheses. We don't do that because we
> just want to collect data or to sell them or whatever.
> No, we collect them because in reality, things are hardly ever as clear-cut
> as we had hypothesized. Our hypotheses are often based on correlations
> between two variables, but in reality, more often than not there is some
> other variable which we had not thought of before that affects one or both
> of the variables we're interested in, and thereby distorts the data.
> 
> Now if we only collected the data that we had a-priori hypotheses about,
> that would mean that after every study, we'd have to go back to the drawing
> board and define which variables to collect next time. This would make
> research both slow and very expensive. By collecting additional data,
> however, we have the chance to run additional exploratory tests after the
> fact, and uncover new hypotheses that we can then test in the next study.
> 
> In the case of KUserFeedback, fortunately cost is not really an issue
> because we don't pay our users for providing the data. Time, on the other
> hand, _is_ an issue. If we strictly only collect data if a hypothesis
> exists about them, that means the following:
> 
> T0: The day of a KDE Applications release, I have a hypothesis about a
> causal link between two variables regarding the usage of KAlgebra.
> 
> T+1day: I use my incredible charming skills to coerce Aleix into
> implementing triggers for collecting data about these two variables.
> 
> T+4 months: The next release ships these collection triggers, data comes in.
> 
> T+5 months: After one month's worth of data are collected, I analyze them.
> The numbers look weird, something is odd. Damn, seems like some other
> variable is in play there. I have a few candidates in mind, some are more
> likely to be the culprit than others.
> 
> T+6 months: I convince Aleix to implement triggers for all the candidates.
> He's reluctant because that seems to go against the minimalism rule, but I
> convince him that I'm really unsure and don't want to risk another release
> cycle only to find out we had tested the wrong variables.
> 
> T+8 months: The release with the new variables is out.
> 
> T+9 months: After a month's worth of data, I run my analysis again. Eureka!
> I've finally found my causal link!
> 
> T+10 months: We come up with an improvement to KAlgebra based on the link
> we've found, and it gets implemented.
> 
> T+12 months: A year after I formulated my first hypothesis, the fruits of
> the whole endeavor get into users' hands.
> 
> And this scenario does not even take into account that it may take months
> until our software reaches the big chunk of users who are on "stable
> distros".
> 
> So, long story short: While I agree that we should not just wildly collect
> everything we can, being able to start measuring variables only on the next
> release after a concrete hypothesis has been formulated about them could
> really slow us down.
> 
> Is there any possible way to mitigate this issue?

The latency is indeed a very valid concern, and we can't even estimate this 
properly yet (deployment latency is one of the first things to measure with 
telemetry IMHO). Expecting anything below several months is way too optimistic 
I think.

More aggressive preemptive tracking might avoid one cycle in your above 
example, but only if you actually manage to think about everything you will 
need in the end.

So to have the complete picture, what data would you want to collect if the
policy didn't restrict you to purpose-bound minimalism? Having a few
examples would make it easier to tweak the balance here I think.

Also note that if we published and freely licensed the raw data, any
exploratory research on that would still be possible, even if that wasn't the
original purpose of the data collection.

Technically there are of course ways to address all this, for example by data 
collection scripts provided by the server and executed by a KUserFeedback 
application-side runtime. That's actually how this started, based on Björn's 
initial wishlist, but I think it's clear why we didn't end up there :)

Regards,
Volker



Re: Telemetry Policy

2017-08-15 Thread Volker Krause

On Monday, 14 August 2017 22:40:28 CEST Ingo Klöcker wrote:
> On Monday 14 August 2017 19:28:06 Volker Krause wrote:
> > On Monday, 14 August 2017 14:16:12 CEST Bhushan Shah wrote:
> > > Can we have policy on how long we can store data? It's just random
> > > idea but I think it makes sense to tell users that after X period
> > > of time your data will be invalidated?
> > > 
> > > This gives the "part-solution" to problem where user wants to delete
> > > their shared data.
> > 
> > Good point. I'm unsure on what to pick as a suitable timeframe though.
> > It's hard to give a specific time right now, we don't know yet how
> > quickly updates of our software are deployed, which is what is going
> > to determine the latency of getting the data we want. For that
> > question alone we are looking at years I think. Maybe this could be
> > worded as "data is only kept as long as the purpose of the data
> > collection hasn't been achieved yet", ie. as soon as we have the
> > answer we were looking for we delete the raw data.
> 
> For most purposes (e.g. which parts of the software are used how often)
> it should be possible to aggregate the raw data monthly and then throw
> away the raw data.
> 
> > The bigger problem however is that this conflicts with publishing the
> > data under a free license. At this point we lose any control to
> > enforce data retention limits.
> 
> With respect to the considerations to make the collected raw data public
> I ask you to contact a data protection officer
> (Datenschutzbeauftragte/r) to get her/his opinion. Quoting
> https://en.wikipedia.org/wiki/General_Data_Protection_Regulation: "Valid
> consent must be explicit for data collected and the purposes data is
> used for (Article 7; defined in Article 4)." Since you cannot state the
> purposes the data is used for (because once made public it could be used
> for any purpose), I cannot see how you could get the users' consent for
> this.

That is true, as long as we deal with personal data. When we discussed this
for the deployment in GammaRay regarding GDPR compliance, we came to the
conclusion that the collected data is not personal data, which makes this
considerably easier.

For illustration, the following JSON data is what a random GammaRay instance
on this machine would submit right now if I opted in to the maximum
telemetry level:

{
  "applicationVersion": { "value": "2.8.50" },
  "compiler": { "type": "GCC", "version": "7.1" },
  "opengl": {
    "glslVersion": "1.30",
    "renderer": "Haswell Mobile ",
    "type": "GL",
    "vendor": "Intel",
    "vendorVersion": "Mesa 17.1.4",
    "version": "3.0"
  },
  "platform": {
    "os": "linux",
    "version": "opensuse-tumbleweed"
  },
  "qtVersion": { "value": "5.9.2" },
  "startCount": { "value": 34 },
  "toolRatio": {
    "objectinspector": { "property": 0.7619047619047619 },
    "quickinspector": { "property": 0.23809523809523808 }
  },
  "usageTime": { "value": 12113 }
}

The server would add a timestamp to that. That's also the level of detail we 
are looking at for telemetry in KDE I think.

The policy kinda implies that we do not want to track anything that common 
sense or laws/regulations would classify as personal data, I'll make that 
explicit to be sure.

The only personal data item we then come into contact with is the IP address,
I think, so the early separation from telemetry data is crucial. Then the
telemetry data is just "non-personal" data, and GDPR etc. wouldn't apply (in
my understanding, IANAL).
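
The separation described above could be sketched on the server side as follows
(Python, for illustration only). The function and parameter names are
hypothetical, and counting requests per address is just one assumed form of
abuse counter-measure; the point is that the stored telemetry record and the
address bookkeeping share no field.

```python
import json
import time
from collections import Counter


def handle_submission(remote_address, body, telemetry_store, abuse_counter,
                      now=None):
    """Store one telemetry submission without linking it to the sender.

    remote_address: address the request came from (only used for counting)
    body:           JSON text as sent by the application
    telemetry_store: list receiving the stored records (stand-in for a DB)
    abuse_counter:  Counter of requests per address (stand-in for rate limits)
    """
    record = json.loads(body)
    if not isinstance(record, dict):
        raise ValueError("telemetry payload must be a JSON object")
    # The server adds the timestamp; the client never sends one.
    record["timestamp"] = int(time.time() if now is None else now)
    telemetry_store.append(record)
    # The address goes into a separate structure with no timestamp or record
    # id, so it cannot be joined back to the telemetry record.
    abuse_counter[remote_address] += 1
```

A usage example would pass an empty list and a Counter() and then inspect
both: the list holds the payload plus timestamp, the counter only a count.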

Regards,
Volker


Re: Telemetry Policy

2017-08-14 Thread Ingo Klöcker

On Sunday 13 August 2017 11:47:28 Volker Krause wrote:
> Hi,
> 
> during the KUserFeedback BoF at Akademy there was quite some interest
> in collecting telemetry data in KDE applications. But before actually
> implementing that we agreed to define the rules under which we would
> want to do that. I've tried to put the input we collected during
> Akademy into proper wording below. What do you think? Did I miss
> anything?

Since my other messages in this thread might give a wrong impression, 
I'd like to clarify that I like Volker's original draft a lot. In my 
opinion, it's very well thought out and most likely fully compliant with 
European and German law, but IANAL.

Regards,
Ingo


Re: Telemetry Policy

2017-08-14 Thread Ben Cooksley

On Tue, Aug 15, 2017 at 8:11 AM, Ingo Klöcker wrote:
> On Monday 14 August 2017 21:53:17 Ben Cooksley wrote:
>> On Sun, Aug 13, 2017 at 9:47 PM, Volker Krause wrote:
>> > Hi,
>>
>> Hi Volker,
>>
>> > during the KUserFeedback BoF at Akademy there was quite some
>> > interest in collecting telemetry data in KDE applications. But
>> > before actually implementing that we agreed to define the rules
>> > under which we would want to do that. I've tried to put the input
>> > we collected during Akademy into proper wording below. What do you
>> > think? Did I miss anything?
>> >
>> > Regards,
>> > Volker
>> >
>> >
>> > # Telemetry Policy Draft
>> >
>> > Application telemetry data can be a valuable tool for tailoring our
>> > products to the needs of our users. The following rules define how
>> > KDE collects and uses such application telemetry data. As privacy
>> > is of utmost importance to us, the general rule of thumb is to err
>> > on the side of caution here. Privacy always trumps any need for
>> > telemetry data, no matter how legitimate.
>> >
>> > These rules apply to all products released by KDE.
>> >
>> > ## Transparency
>> >
>> > We provide detailed information about the data that is going to be
>> > shared, in a way that:
>> > - is easy to understand
>> > - is precise and complete
>> > - is available locally without network connectivity
>> >
>> > Any changes or additions to the telemetry functionality of an
>> > application will be highlighted in the corresponding release
>> > announcement.
>> >
>> > ## Control
>> >
>> > We give the user full control over what data they want to share with
>> > KDE. In particular:
>> > - application telemetry is always opt-in, that is off by default
>> > - application telemetry settings can be changed at any time, and are
>> > provided as prominently in the application interface as other
>> > application settings
>> > - applications honor system-wide telemetry settings where they exist
>> > (global "kill switch")
>> > - we provide detailed documentation about how to control the
>> > application telemetry system
>> >
>> > In order to ensure control over the data after it has been shared
>> > with KDE, applications will only transmit this data to KDE servers,
>> > that is servers under the full control of the KDE sysadmin team.
>> >
>> > We will provide a designated contact point for users who have
>> > concerns about the data they have shared with KDE. While we are
>> > willing to delete data a user no longer wants to have shared, it
>> > should be understood that the below rules are designed to make
>> > identification of data of a specific user impossible, and thus a
>> > deletion request impractical.
>>
>> Can we change "impractical" to "effectively impossible" here please?
>>
>> > ## Anonymity
>> >
>> > We do not transmit data that could be used to identify a specific
>> > user. In particular:
>> > - we will not use any unique device, installation or user id
>> > - data is stripped of any unnecessary detail and downsampled
>> > appropriately before sharing to avoid fingerprinting
>> > - network addresses (which are exposed inevitably as part of the data
>> > transmission) are not stored together with the telemetry data, and
>> > must only be stored or used to the extent necessary for abuse
>> > counter-measures
>>
>> I'm wary that people might jump on the network addresses bit here.
>>
>> Can we please mention that all records that contain network addresses
>> and other similar information would be stored in such a form that they
>> could not be associated with telemetry records.
>>
>> In terms of the logs - as there are other uses for them, I'd prefer if
>> we widened that to also allow them to be kept to allow us to maintain
>> the proper and effective operation of the telemetry system and other
>> associated services. The time we retain those logs should also be at
>> our complete and total discretion and if need be should be
>> indefinite.
>
> I'm pretty sure that this would be a violation of the European General
> Data Protection Regulation.
>
> In Germany IP addresses are considered personal data (by rulings of the
> German constitutional court). Therefore, IP ad

Re: Telemetry Policy

2017-08-14 Thread Ingo Klöcker

On Monday 14 August 2017 19:28:06 Volker Krause wrote:
> On Monday, 14 August 2017 14:16:12 CEST Bhushan Shah wrote:
> > Can we have policy on how long we can store data? It's just random
> > idea but I think it makes sense to tell users that after X period
> > of time your data will be invalidated?
> > 
> > This gives the "part-solution" to problem where user wants to delete
> > their shared data.
> 
> Good point. I'm unsure on what to pick as a suitable timeframe though.
> It's hard to give a specific time right now, we don't know yet how
> quickly updates of our software are deployed, which is what is going
> to determine the latency of getting the data we want. For that
> question alone we are looking at years I think. Maybe this could be
> worded as "data is only kept as long as the purpose of the data
> collection hasn't been achieved yet", ie. as soon as we have the
> answer we were looking for we delete the raw data.

For most purposes (e.g. which parts of the software are used how often) 
it should be possible to aggregate the raw data monthly and then throw 
away the raw data.
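
A minimal sketch of the monthly aggregation suggested above, assuming each raw sample is just a (submission date, feature name) pair. The record layout, the feature names and the function name `aggregate_monthly` are hypothetical and not taken from KUserFeedback:

```python
from collections import Counter
from datetime import date


def aggregate_monthly(samples):
    """Count feature usage per (year, month, feature), dropping per-sample detail."""
    counts = Counter()
    for day, feature in samples:
        counts[(day.year, day.month, feature)] += 1
    return dict(counts)


# Hypothetical raw samples: (submission date, feature used).
raw_samples = [
    (date(2017, 8, 1), "plot2d"),
    (date(2017, 8, 14), "plot2d"),
    (date(2017, 8, 20), "console"),
    (date(2017, 9, 2), "plot2d"),
]

monthly = aggregate_monthly(raw_samples)
raw_samples.clear()  # only the aggregate is kept after this point
```

Whether monthly counts alone would answer a given question depends on the question; the sketch only illustrates that the raw samples need not outlive the aggregation step.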

> The bigger problem however is that this conflicts with publishing the
> data under a free license. At this point we lose any control to
> enforce data retention limits.

With respect to the considerations to make the collected raw data public 
I ask you to contact a data protection officer 
(Datenschutzbeauftragte/r) to get her/his opinion. Quoting 
https://en.wikipedia.org/wiki/General_Data_Protection_Regulation: "Valid 
consent must be explicit for data collected and the purposes data is 
used for (Article 7; defined in Article 4)." Since you cannot state the 
purposes the data is used for (because once made public it could be used 
for any purpose), I cannot see how you could get the users' consent for 
this.

Regards,
Ingo


Re: Telemetry Policy

2017-08-14 Thread Ingo Klöcker

On Monday 14 August 2017 22:26:36 Thomas Pfeiffer wrote:
> On Sunday, 13 August 2017 11:47:28 CEST Volker Krause wrote:
> > ## Minimalism
> > 
> > We only track the bare minimum of data necessary to answer specific
> > questions, we do not collect data preemptively or for exploratory
> > research. In particular, this means:
> > - collected data  must have a clear purpose
> 
> While from a privacy perspective this certainly makes sense, with my
> user researcher hat on I'm worried that this might severely limit the
> usefulness of the whole operation, at least if changes to what is
> being tracked can only be made with each new major release of an
> application.
> 
[snip]
> 
> So, long story short: While I agree that we should not just wildly
> collect everything we can, being able to start measuring variables
> only on the next release after a concrete hypothesis has been
> formulated about them could really slow us down.
> 
> Is there any possible way to mitigate this issue?

I suggest the same to you as I did to Volker: Contact a data protection 
officer, e.g. the one assigned to the company you work for, and talk 
with her about it.


Regards,
Ingo



Re: Telemetry Policy

2017-08-14 Thread Thomas Pfeiffer

On Sunday, 13 August 2017 11:47:28 CEST Volker Krause wrote:

> ## Minimalism
> 
> We only track the bare minimum of data necessary to answer specific
> questions, we do not collect data preemptively or for exploratory research.
> In particular, this means:
> - collected data  must have a clear purpose

While from a privacy perspective this certainly makes sense, with my user 
researcher hat on I'm worried that this might severely limit the usefulness of 
the whole operation, at least if changes to what is being tracked can only be 
made with each new major release of an application.

Psychologists usually collect more information in their studies than they 
would strictly need to test their hypotheses. We do not do that simply because 
we want to collect data or sell them.
No, we collect them because in reality, things are hardly ever as clear-cut as 
we had hypothesized. Our hypotheses are often based on correlations between 
two variables, but in reality, more often than not there is some other 
variable which we had not thought of before that affects one or both of the 
variables we're interested in, and thereby distorts the data.

Now if we only collected the data that we had a-priori hypotheses about, that 
would mean that after every study, we'd have to go back to the drawing board 
and define which variables to collect next time. This would make research both 
slow and very expensive. By collecting additional data, however, we have the 
chance to run additional exploratory tests after the fact, and uncover new 
hypotheses that we can then test in the next study.

In the case of KUserFeedback, fortunately cost is not really an issue because 
we don't pay our users for providing the data. Time, on the other hand, _is_ 
an issue. If we strictly only collect data if a hypothesis exists about them, 
that means the following:

T0: The day of a KDE Applications release, I have a hypothesis about a causal 
link between two variables regarding the usage of KAlgebra.

T+1 day: I use my incredibly charming skills to coerce Aleix into implementing 
triggers for collecting data about these two variables.

T+4 months: The next release ships these collection triggers, data comes in.

T+5 months: After one month's worth of data are collected, I analyze them. The 
numbers look weird, something is odd. Damn, seems like some other variable is 
in play there. I have a few candidates in mind, some are more likely to be the 
culprit than others.

T+6 months: I convince Aleix to implement triggers for all the candidates. 
He's reluctant because that seems to go against the minimalism rule, but I 
convince him that I'm really unsure and don't want to risk another release 
cycle only to find out we had tested the wrong variables.

T+8 months: The release with the new variables is out.

T+9 months: After a month's worth of data, I run my analysis again. Eureka! 
I've finally found my causal link!

T+10 months: We come up with an improvement to KAlgebra based on the link 
we've found, and it gets implemented.

T+12 months: A year after I formulated my first hypothesis, the fruits of the 
whole endeavor get into users' hands.

And this scenario does not even take into account that it may take months 
until our software reaches the big chunk of users who are on "stable distros".

So, long story short: While I agree that we should not just wildly collect 
everything we can, being able to start measuring variables only on the next 
release after a concrete hypothesis has been formulated about them could 
really slow us down.

Is there any possible way to mitigate this issue?

Cheers,
Thomas

Re: Telemetry Policy

2017-08-14 Thread Ingo Klöcker

On Monday 14 August 2017 21:53:17 Ben Cooksley wrote:
> On Sun, Aug 13, 2017 at 9:47 PM, Volker Krause wrote:
> > Hi,
> 
> Hi Volker,
> 
> > during the KUserFeedback BoF at Akademy there was quite some
> > interest in collecting telemetry data in KDE applications. But
> > before actually implementing that we agreed to define the rules
> > under which we would want to do that. I've tried to put the input
> > we collected during Akademy into proper wording below. What do you
> > think? Did I miss anything?
> > 
> > Regards,
> > Volker
> > 
> > 
> > # Telemetry Policy Draft
> > 
> > Application telemetry data can be a valuable tool for tailoring our
> > products to the needs of our users. The following rules define how
> > KDE collects and uses such application telemetry data. As privacy
> > is of utmost importance to us, the general rule of thumb is to err
> > on the side of caution here. Privacy always trumps any need for
> > telemetry data, no matter how legitimate.
> > 
> > These rules apply to all products released by KDE.
> > 
> > ## Transparency
> > 
> > We provide detailed information about the data that is going to be
> > shared, in a way that:
> > - is easy to understand
> > - is precise and complete
> > - is available locally without network connectivity
> > 
> > Any changes or additions to the telemetry functionality of an
> > application will be highlighted in the corresponding release
> > announcement.
> > 
> > ## Control
> > 
> > We give the user full control over what data they want to share with
> > KDE. In particular:
> > - application telemetry is always opt-in, that is off by default
> > - application telemetry settings can be changed at any time, and are
> > provided as prominent in the application interface as other
> > application settings
> > - applications honor system-wide telemetry settings where they exist
> > (global "kill switch")
> > - we provide detailed documentation about how to control the
> > application telemetry system
> > 
> > In order to ensure control over the data after it has been shared
> > with KDE, applications will only transmit this data to KDE servers,
> > that is servers under the full control of the KDE sysadmin team.
> > 
> > We will provide a designated contact point for users who have
> > concerns about the data they have shared with KDE. While we are
> > willing to delete data a user no longer wants to have shared, it
> > should be understood that the below rules are designed to make
> > identification of data of a specific user impossible, and thus a
> > deletion request impractical.
> 
> Can we change "impractical" to "effectively impossible" here please?
> 
> > ## Anonymity
> > 
> > We do not transmit data that could be used to identify a specific
> > user. In particular:
> > - we will not use any unique device, installation or user id
> > - data is stripped of any unnecessary detail and downsampled
> > appropriately before sharing to avoid fingerprinting
> > - network addresses (which are exposed inevitably as part of the
> > data transmission) are not stored together with the telemetry data,
> > and must only be stored or used to the extent necessary for abuse
> > counter-measures
> I'm wary that people might jump on the network addresses bit here.
> 
> Can we please mention that all records that contain network addresses
> and other similar information would be stored in such a form that they
> could not be associated with telemetry records.
> 
> In terms of the logs - as there are other uses for them, i'd prefer if
> we widened that to also allow them to be kept to allow us to maintain
> the proper and effective operation of the telemetry system and other
> associated services. The time we retain those logs should also be at
> our complete and total discretion and if need be should be
> indefinite.

I'm pretty sure that this would be a violation of the European General 
Data Protection Regulation.

In Germany IP addresses are considered personal data (by rulings of the 
German constitutional court). Therefore, IP addresses must be 
anonymized, e.g. by zeroing the last part of the quadruplet (see for 
example the anonymizeIp setting of Google Analytics), if they are used 
for anything other than maintaining the security of a service. Even if 
used for maintaining the security of a service they must not be stored 
longer than absolutely necessary. Storing IP addresses indefinitely, or 
at least for a long period of time, is the "wet dream" of all national 
law enforcement and intelligence institutions -> Vorratsdatenspeicherung 
(data retention). Luckily, so far those dreams have been stalled by the 
German constitutional court. The German Minister of the Interior would 
be delighted if KDE provided such data.
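
To illustrate what "zeroing the last part of the quadruplet" means in practice, here is a minimal sketch using Python's standard `ipaddress` module. The function name `anonymize_ipv4` is hypothetical, and the sketch makes no claim about whether this level of truncation is legally sufficient:

```python
import ipaddress


def anonymize_ipv4(address: str) -> str:
    """Zero the last octet of an IPv4 address, keeping only the /24 network part."""
    ip = ipaddress.IPv4Address(address)  # raises ValueError for non-IPv4 input
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)


print(anonymize_ipv4("192.0.2.123"))  # 192.0.2.0
```

The address would be truncated before it is written to any log, so the full address never reaches storage.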


Regards,
Ingo



Re: Telemetry Policy

2017-08-14 Thread Volker Krause

On Sunday, 13 August 2017 11:47:28 CEST Volker Krause wrote:
> Hi,
> 
> during the KUserFeedback BoF at Akademy there was quite some interest in
> collecting telemetry data in KDE applications. But before actually
> implementing that we agreed to define the rules under which we would want to
> do that. I've tried to put the input we collected during Akademy into
> proper wording below. What do you think? Did I miss anything?
> 
> Regards,
> Volker
> 
> 
> # Telemetry Policy Draft

Added to the wiki, so we have version control:
https://community.kde.org/Policies/Telemetry_Policy

I've integrated the requirement for transport security suggested by Thomas, 
Ben's comments and Bhushan's idea of a global registry of telemetry-enabled 
applications. Please verify this reflects what you had in mind.

Still open policy questions (unless I missed something):
- do we want to mandate an audit log?
- regulations for licensing and publishing of the data
- should we mandate revocation support, and if so for how long after 
submission?
- should we have upper limits for data retention?

Not sure yet how to balance the conflict between limited data retention and 
revocation support on one side and publication/free licensing on the other 
side. 

The audit log looks easy to implement and has been requested before, so I 
guess there wouldn't be objections to adding that as a requirement?

Thanks for all the input so far!
Volker


Re: Telemetry Policy

2017-08-14 Thread Volker Krause

On Monday, 14 August 2017 14:16:12 CEST Bhushan Shah wrote:
> Hello Volker,
> 
> First of all thanks for working on this topic, I've some comments I
> would like to add.
> 
> On Sun, Aug 13, 2017 at 11:47:28AM +0200, Volker Krause wrote:
> > ## Control
> > 
> > We give the user full control over what data they want to share with KDE.
> > In particular:
> > - application telemetry is always opt-in, that is off by default
> > - application telemetry settings can be changed at any time, and are
> > provided as prominent in the application interface as other application
> > settings
> > - applications honor system-wide telemetry settings where they exist
> > (global "kill switch")
> > - we provide detailed documentation about how to control the application
> > telemetry system
> 
> This is kind of technical point but here it goes,
> 
> I think we should include the point in policy that, this data is stored
> and transferred in encrypted format only, on both user's machine and the
> KDE's server. This prevents Man in middle attacks and also prevents
> unwanted/unauthorized access to user's data by third party application
> on local machine.
> 
> So how I think this should happen is,
> 
> -> Application starts
> -> Writes encrypted data to storage file
> -> Encrypted data is transferred to KDE server
> -> They are decrypted on KDE server when needed
> 
> > We will provide a designated contact point for users who have concerns
> > about the data they have shared with KDE. While we are willing to delete
> > data a user no longer wants to have shared, it should be understood that
> > the below rules are designed to make identification of data of a specific
> > user impossible, and thus a deletion request impractical.
> 
> Can we have policy on how long we can store data? It's just random idea
> but I think it makes sense to tell users that after X period of time
> your data will be invalidated?
> 
> This gives the "part-solution" to problem where user wants to delete
> their shared data.

Good point. I'm unsure what to pick as a suitable timeframe though. It's 
hard to give a specific time right now, since we don't know yet how quickly 
updates of our software are deployed, which is what is going to determine the 
latency of getting the data we want. For that question alone we are looking at 
years, I think. Maybe this could be worded as "data is only kept as long as the 
purpose of the data collection hasn't been achieved yet", i.e. as soon as we 
have the answer we were looking for, we delete the raw data.

The bigger problem however is that this conflicts with publishing the data 
under a free license. Once the data is published, we lose any ability to 
enforce data retention limits.

> > ## Compliance
> > 
> > KDE only releases products capable of acquiring telemetry data if
> > compliance with these rules has been established by a public review on
> > [kde-core-devel| kde-community]@kde.org from at least two reviewers. The
> > review has to be repeated for every release if changes have been made to
> > how/what data is collected.
> 
> In addition to kde-community/kde-core-devel there should be public
> webpage at e.g. https://telemetry.kde.org where we explain what
> application collects what data and where it is used for clear
> transperancy.

A global registry of telemetry enabled KDE apps sounds like a good idea, 
basically documenting the review results in e.g. a wiki. This would also make 
it easier to spot problematic combinations or data tracked multiple times (say 
KDevelop tracking stuff the Katepart already tracks, and/or each of them 
tracking data that in combination might enable fingerprinting).
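
As a rough sketch of how such a registry could help spot data tracked multiple times, assume each application simply lists the data points it collects. The registry layout and the data point names below are made up for illustration:

```python
from collections import defaultdict


def find_duplicate_sources(registry):
    """Return the data points that more than one application collects."""
    collectors = defaultdict(set)
    for app, data_points in registry.items():
        for point in data_points:
            collectors[point].add(app)
    return {point: sorted(apps) for point, apps in collectors.items() if len(apps) > 1}


# Hypothetical registry entries; the data point names are invented.
registry = {
    "KDevelop": {"application_version", "screen_count", "editor_font"},
    "Katepart": {"editor_font", "locale"},
}

duplicates = find_duplicate_sources(registry)
```

Detecting combinations that enable fingerprinting would need human review on top of this; the sketch only covers the simpler duplicate case.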

Regards,
Volker


Re: Telemetry Policy

2017-08-14 Thread Martin Flöser

On 2017-08-14 14:17, Volker Krause wrote:
> On Sunday, 13 August 2017 12:56:27 CEST Martin Flöser wrote:
> > On 2017-08-13 11:47, Volker Krause wrote:
> > > Hi,
> > >
> > > during the KUserFeedback BoF at Akademy there was quite some interest
> > > in collecting telemetry data in KDE applications. But before actually
> > > implementing that we agreed to define the rules under which we would
> > > want to do that. I've tried to put the input we collected during
> > > Akademy into proper wording below. What do you think? Did I miss
> > > anything?
> >
> > To me it looks good!
> >
> > I have some additional requests:
> >   * the collected data must be made available to the public (mostly
> > thinking of research institutes here)
>
> This has come up before, not in the context of 3rd parties like research
> organisations, but for transparency towards our users.
>
> There is a practical limitation of making raw data available live, as that
> would create a publicly readable and writable system, with similar abuse
> potential as e.g. pastebin. But I don't think that is the requirement you
> have in mind here, it's more about sharing the raw data after
> review/eventually, right?

Yes. I certainly don't want to give public edit functionality.

> In the currently envisioned setup anyone with a KDE contributor account
> would have access, so the remaining questions would be about the
> practicalities and processes to review and release the data to the general
> public I think.
>
> >   * data must be made available under a CC license (CC0?)
>
> Interesting point, I hadn't thought about that yet :) Can we even license
> the data, as we didn't create it? Do we need to ask our users to license
> their telemetry contributions?

Yes, we can license the data. We are building up a database, so database 
copyright applies. As you are German, I reference the 
German Wikipedia article: https://de.wikipedia.org/wiki/Datenbankwerk

> >   * maybe allow the user to delete the dataset again (difficult as that
> > conflicts with making the data public and would require authentication
> > which is the opposite to anonymity).
>
> As discussed on kde-core-devel a while ago, I think this would be doable
> technically, without compromising anonymity. The server would generate a
> unique unpredictable token for each submitted sample and return that to
> the client. The client collects those and can use them as part of a
> deletion request.
>
> However, this does only work as long as we have full control over the
> data, we can't recall data that has already been extracted from our
> systems. So I think this conflicts with the first two requirements you
> mentioned. How do we want to resolve that?

Yes, I am aware that these are contradicting requirements. Of course it's 
not possible to delete data after it's published, but maybe we have 
situations where a user submits data, then thinks about it half an 
hour later and decides "no, I don't want to share". So if it's 
technically possible it would be nice to have.
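
The token-based deletion idea could be sketched roughly as follows. This is only an illustration under stated assumptions: the class `SampleStore`, its methods and the sample contents are hypothetical, and an in-memory dictionary stands in for whatever the real server would use:

```python
import secrets


class SampleStore:
    """In-memory stand-in for the server-side telemetry store."""

    def __init__(self):
        self._samples = {}

    def submit(self, sample: dict) -> str:
        """Store a sample and return a random token that identifies only this sample."""
        token = secrets.token_urlsafe(32)
        self._samples[token] = sample
        return token

    def delete(self, token: str) -> bool:
        """Delete the sample for this token; return False if the token is unknown."""
        return self._samples.pop(token, None) is not None

    def __len__(self):
        return len(self._samples)


store = SampleStore()
token = store.submit({"application_version": "3.1"})  # hypothetical sample
deleted = store.delete(token)        # the user changes their mind
deleted_again = store.delete(token)  # the token is no longer valid
```

Because each token is generated per sample and carries no user or installation id, it does not link a user's samples to each other on the server side. It only works while the data is still under the server's control, as noted above.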

Cheers
Martin

> Regards,
> Volker
>
> > > # Telemetry Policy Draft
> > >
> > > Application telemetry data can be a valuable tool for tailoring our
> > > products to the needs of our users. The following rules define how KDE
> > > collects and uses such application telemetry data. As privacy is of
> > > utmost importance to us, the general rule of thumb is to err on the
> > > side of caution here. Privacy always trumps any need for telemetry
> > > data, no matter how legitimate.
> > >
> > > These rules apply to all products released by KDE.
> > >
> > > ## Transparency
> > >
> > > We provide detailed information about the data that is going to be
> > > shared, in a way that:
> > > - is easy to understand
> > > - is precise and complete
> > > - is available locally without network connectivity
> > >
> > > Any changes or additions to the telemetry functionality of an
> > > application will be highlighted in the corresponding release
> > > announcement.
> > >
> > > ## Control
> > >
> > > We give the user full control over what data they want to share with
> > > KDE. In particular:
> > > - application telemetry is always opt-in, that is off by default
> > > - application telemetry settings can be changed at any time, and are
> > > provided as prominent in the application interface as other
> > > application settings
> > > - applications honor system-wide telemetry settings where they exist
> > > (global "kill switch")
> > > - we provide detailed documentation about how to control the
> > > application telemetry system
> > >
> > > In order to ensure control over the data after it has been shared with
> > > KDE, applications will only transmit this data to KDE servers, that is
> > > servers under the full control of the KDE sysadmin team.
> > >
> > > We will provide a designated contact point for users who have concerns
> > > about the data they have shared with KDE. While we are willing to de

Re: Telemetry Policy

2017-08-14 Thread Volker Krause

On Monday, 14 August 2017 14:49:18 CEST Bhushan Shah wrote:
> On Mon, Aug 14, 2017 at 02:32:46PM +0200, Thomas Baumgart wrote:
> > What does keeping the local data in encrypted form help here? The
> > application (KUserFeedback function?) must be able to display the
> > transferred data to the user upon his request, so it needs to decrypt it.
> > The key material must be known on both ends (application and KDE server),
> > so it can be treated as public since it must be delivered with the
> > applications source code. That does not make sense to me.
> 
> While you are right about the need to show decrypted data in the
> KUserFeedback applications, it is also important to note that not every
> application or process running on the user's computer is provided by
> KDE.
> 
> For example,
> 
> I as a user trust KDE community to provide the telemetry data but
> might not trust the 3rd party developer whose (closed source) app I've
> installed, and I don't want them to read AND/OR write to the telemetry
> datastore without me knowing.

If that's the threat model, you have a far far bigger problem than access to 
the telemetry data. Those apps have full access to your data, your passwords, 
all your keyboard input (assuming X11), etc.

> I know that is extra hassle and one more layer of indirection but this
> ensures that only trusted application have read-write access to the
> data. Also I am not sure if this is actually technically possible
> without complicating things too much or not.. but it's something worth
> investigating.

I'd say it's not just a technical complication, it's conceptually impossible. 
Where would you store the encryption key if you have to assume local file 
system access of an untrusted process? I can only think of one safe place 
that's not either compromised or would violate the rules of the telemetry 
policy: the user's head. But I doubt we'd want to bother them with a password 
for the telemetry data.

That aside, this would also conflict with the audit log proposed by Martin S 
earlier in this thread I think.

> > Transporting the data in end-to-end encrypted form (TLS) to avoid MITM is
> > a
> > different story which I already mentioned.
> 
> Yeah, I read your message above, it's just it was related to my point so
> I repeated it here as well

Transport security is so obvious I hadn't even thought about mentioning it in 
the policy :) It has been implemented from the start already.

Regards,
Volker


Re: Telemetry Policy

2017-08-14 Thread Bhushan Shah

Hello Thomas,

On Mon, Aug 14, 2017 at 02:32:46PM +0200, Thomas Baumgart wrote:
> What does keeping the local data in encrypted form help here? The application 
> (KUserFeedback function?) must be able to display the transferred data to the 
> user upon his request, so it needs to decrypt it. The key material must be 
> known on both ends (application and KDE server), so it can be treated as 
> public since it must be delivered with the applications source code. That 
> does 
> not make sense to me.

While you are right about the need to show decrypted data in the
KUserFeedback applications, it is also important to note that not every
application or process running on the user's computer is provided by
KDE.

For example,

As a user, I trust the KDE community with my telemetry data, but I
might not trust the third-party developer whose (closed source) app I've
installed, and I don't want that app to read or write the telemetry
datastore without my knowledge.

I know that is extra hassle and one more layer of indirection, but this
ensures that only trusted applications have read-write access to the
data. I am also not sure whether this is technically possible without
complicating things too much, but it's worth investigating.

> Transporting the data in end-to-end encrypted form (TLS) to avoid MITM is a 
> different story which I already mentioned.

Yeah, I read your message above; it was related to my point, so
I repeated it here as well.

Thanks

-- 
Bhushan Shah
http://blog.bshah.in
IRC Nick : bshah on Freenode
GPG key fingerprint : 0AAC 775B B643 7A8D 9AF7 A3AC FE07 8411 7FBC E11D


Re: Telemetry Policy

2017-08-14 Thread Thomas Baumgart

Hi,

On Monday, 14 August 2017 17:46:12 CEST Bhushan Shah wrote:

> Hello Volker,
> 
> First of all thanks for working on this topic, I've some comments I
> would like to add.
> 
> On Sun, Aug 13, 2017 at 11:47:28AM +0200, Volker Krause wrote:
> > ## Control
> > 
> > We give the user full control over what data they want to share with KDE.
> > In particular:
> > - application telemetry is always opt-in, that is off by default
> > - application telemetry settings can be changed at any time, and are
> > provided as prominent in the application interface as other application
> > settings
> > - applications honor system-wide telemetry settings where they exist
> > (global "kill switch")
> > - we provide detailed documentation about how to control the application
> > telemetry system
> 
> This is kind of technical point but here it goes,
> 
> I think we should include the point in policy that, this data is stored
> and transferred in encrypted format only, on both user's machine and the
> KDE's server. This prevents Man in middle attacks and also prevents
> unwanted/unauthorized access to user's data by third party application
> on local machine.
> 
> So how I think this should happen is,
> 
> -> Application starts
> -> Writes encrypted data to storage file
> -> Encrypted data is transferred to KDE server
> -> They are decrypted on KDE server when needed

How does keeping the local data in encrypted form help here? The application 
(KUserFeedback function?) must be able to display the transferred data to the 
user upon request, so it needs to decrypt it. The key material must be 
known on both ends (application and KDE server), so it can be treated as 
public since it must be delivered with the application's source code. That does 
not make sense to me.

Why would we need to encipher and protect something when we tell the user 
that the data provided is transparent and does not contain non-anonymized values?

Transporting the data in end-to-end encrypted form (TLS) to avoid MITM is a 
different story which I already mentioned.

> > We will provide a designated contact point for users who have concerns
> > about the data they have shared with KDE. While we are willing to delete
> > data a user no longer wants to have shared, it should be understood that
> > the below rules are designed to make identification of data of a specific
> > user impossible, and thus a deletion request impractical.
> 
> Can we have policy on how long we can store data? It's just random idea
> but I think it makes sense to tell users that after X period of time
> your data will be invalidated?
> 
> This gives the "part-solution" to problem where user wants to delete
> their shared data.
> 
> > ## Compliance
> > 
> > KDE only releases products capable of acquiring telemetry data if
> > compliance with these rules has been established by a public review on
> > [kde-core-devel| kde-community]@kde.org from at least two reviewers. The
> > review has to be repeated for every release if changes have been made to
> > how/what data is collected.
> 
> In addition to kde-community/kde-core-devel there should be public
> webpage at e.g. https://telemetry.kde.org where we explain what
> application collects what data and where it is used for clear
> transperancy.
> 
> I think that's all points I wanted to add.
> 
> Thanks

-- 

Regards

Thomas Baumgart

https://www.telegram.org/   Telegram, the better WhatsApp
-
Progress isn't made by early risers. It's made by lazy men
trying to find easier ways to do something. -- Robert Heinlein
-



Re: Telemetry Policy

2017-08-14 Thread Bhushan Shah

Hello Volker,

First of all thanks for working on this topic, I've some comments I
would like to add.

On Sun, Aug 13, 2017 at 11:47:28AM +0200, Volker Krause wrote:
> ## Control
> 
> We give the user full control over what data they want to share with KDE. In 
> particular:
> - application telemetry is always opt-in, that is off by default
> - application telemetry settings can be changed at any time, and are provided 
> as prominent in the application interface as other application settings
> - applications honor system-wide telemetry settings where they exist (global 
> "kill switch")
> - we provide detailed documentation about how to control the application 
> telemetry system

This is kind of a technical point, but here it goes:

I think we should include in the policy that this data is stored and
transferred in encrypted format only, both on the user's machine and on
KDE's server. This prevents man-in-the-middle attacks and also prevents
unwanted/unauthorized access to the user's data by third-party applications
on the local machine.

Here is how I think this should happen:

-> Application starts
-> Writes encrypted data to storage file
-> Encrypted data is transferred to KDE server
-> They are decrypted on KDE server when needed

> We will provide a designated contact point for users who have concerns about 
> the data they have shared with KDE. While we are willing to delete data a 
> user 
> no longer wants to have shared, it should be understood that the below rules 
> are designed to make identification of data of a specific user impossible, 
> and 
> thus a deletion request impractical.

Can we have a policy on how long we can store data? It's just a random idea,
but I think it makes sense to tell users that after X period of time
their data will be invalidated.

This gives a partial solution to the problem where a user wants to delete
their shared data.

> ## Compliance
> 
> KDE only releases products capable of acquiring telemetry data if compliance 
> with these rules has been established by a public review on [kde-core-devel|
> kde-community]@kde.org from at least two reviewers. The review has to be 
> repeated for every release if changes have been made to how/what data is 
> collected.

In addition to kde-community/kde-core-devel there should be a public
webpage at e.g. https://telemetry.kde.org where we explain which
application collects what data and what it is used for, for clear
transparency.

I think those are all the points I wanted to add.

Thanks

-- 
Bhushan Shah
http://blog.bshah.in
IRC Nick : bshah on Freenode
GPG key fingerprint : 0AAC 775B B643 7A8D 9AF7 A3AC FE07 8411 7FBC E11D


Re: Telemetry Policy

2017-08-14 Thread Volker Krause

On Sunday, 13 August 2017 12:56:27 CEST Martin Flöser wrote:
> On 2017-08-13 11:47, Volker Krause wrote:
> > Hi,
> > 
> > during the KUserFeedback BoF at Akademy there was quite some interest in
> > collecting telemetry data in KDE applications. But before actually
> > implementing that we agreed to define the rules under which we would
> > want to do that. I've tried to put the input we collected during Akademy
> > into proper wording below. What do you think? Did I miss anything?
> 
> To me it looks good!
> 
> I have some additional requests:
>   * the collected data must be made available to the public (mostly
> thinking of research institutes here)

This has come up before, not in the context of 3rd parties like research 
organisations, but for transparency towards our users.

There is a practical limitation of making raw data available live, as that 
would create a publicly readable and writable system, with similar abuse 
potential as e.g. pastebin. But I don't think that is the requirement you have 
in mind here, it's more about sharing the raw data after review/eventually, 
right?

In the currently envisioned setup anyone with a KDE contributor account would 
have access, so the remaining questions would be about the practicalities and 
processes to review and release the data to the general public I think.

>   * data must be made available under a CC license (CC0?)

Interesting point, I hadn't thought about that yet :) Can we even license the 
data, as we didn't create it? Do we need to ask our users to license their 
telemetry contributions?

>   * maybe allow the user to delete the dataset again (difficult as that
> conflicts with making the data public and would require authentication
> which is the opposite to anonymity).

As discussed on kde-core-devel a while ago, I think this would be doable 
technically, without compromising anonymity. The server would generate a 
unique unpredictable token for each submitted sample and return that to the 
client. The client collects those and can use them as part of a deletion 
request.
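
The token scheme described above can be sketched roughly as follows. This is
a minimal Python illustration, not the KUserFeedback server code; the
`SampleStore` class and its method names are hypothetical, and an in-memory
dict stands in for the real storage.

```python
import secrets


class SampleStore:
    """In-memory stand-in for the telemetry server's sample storage."""

    def __init__(self):
        self._samples = {}  # token -> sample; no user or device id involved

    def submit(self, sample):
        """Store a sample and return a fresh, unpredictable token for it."""
        token = secrets.token_urlsafe(32)  # 32 random bytes from the OS CSPRNG
        self._samples[token] = sample
        return token

    def delete(self, tokens):
        """Delete the samples matching the given tokens; return the count."""
        removed = 0
        for token in tokens:
            if token in self._samples:
                del self._samples[token]
                removed += 1
        return removed

    def __len__(self):
        return len(self._samples)


if __name__ == "__main__":
    server = SampleStore()
    # The client keeps the returned tokens locally, one per submitted sample.
    client_tokens = [server.submit({"screens": 2}), server.submit({"screens": 1})]
    server.submit({"screens": 1})  # a sample from some other client
    print(server.delete(client_tokens), len(server))
```

The tokens are generated per sample rather than per user, so the stored data
carries no stable identifier. A deletion request that lists several tokens
does reveal that those samples belong together, but only at the moment they
are being removed.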

However, this only works as long as we have full control over the data; we 
can't recall data that has already been extracted from our systems. So I think 
this conflicts with the first two requirements you mentioned. How do we want to 
resolve that?

Regards,
Volker

Re: Telemetry Policy

2017-08-14 Thread Volker Krause

I agree on the proposed wording changes, so focusing on your technical points 
below.

On Monday, 14 August 2017 11:53:17 CEST Ben Cooksley wrote:
> I've got two technical notes here:
> 
> 1) All products should fetch details on where to submit telemetry data
> from an online configuration file similar to
> https://autoconfig.kde.org/ocs/providers.xml
> 
> This would give us the capacity to version the telemetry server api,
> and potentially even "kill" telemetry submissions from older
> application versions if needed.
> 
> 2) No software product should use the QNetworkAccessManager family of
> classes due to known defects in its operation within some versions of
> Qt which cause infrastructure problems.

The current implementation uses QNAM, but actually has code to handle HTTP 
redirects correctly (with unit test coverage); I assume that's the issue you 
are referring to? This has also been tested all the way back to Qt 4.8 as part 
of the existing deployment in GammaRay.

I don't mind adding the extra indirection with the configuration file, although 
just from the XML I don't yet see what that would provide beyond HTTP 
redirects. Is certain information (e.g. the app version) already passed 
as part of the request for the configuration file? Or can there be conditional 
aspects not currently present in the above example?
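
A minimal sketch of the kind of conditional configuration being asked about,
assuming a hypothetical XML layout with one entry per server API version and
an enable flag. The element names, attributes and URLs below are invented
for illustration and do not mirror providers.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration document; names and URLs are made up.
EXAMPLE_CONFIG = """\
<telemetry>
  <endpoint api="1" enabled="false" url="https://telemetry.example.org/v1/"/>
  <endpoint api="2" enabled="true" url="https://telemetry.example.org/v2/"/>
</telemetry>
"""


def submission_url(config_xml, client_api):
    """Return the URL a client speaking `client_api` should submit to.

    Returns None if that API version is unknown or has been disabled, in
    which case the client should not submit anything.
    """
    root = ET.fromstring(config_xml)
    for endpoint in root.findall("endpoint"):
        if endpoint.get("api") == str(client_api):
            if endpoint.get("enabled") == "true":
                return endpoint.get("url")
            return None
    return None


if __name__ == "__main__":
    print(submission_url(EXAMPLE_CONFIG, 1))  # submissions from old clients are off
    print(submission_url(EXAMPLE_CONFIG, 2))
```

In this sketch the client selects its entry locally, so no application
version has to be sent along with the request for the configuration file.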

Regards,
Volker

Re: Telemetry Policy

2017-08-14 Thread Ben Cooksley

On Sun, Aug 13, 2017 at 9:47 PM, Volker Krause  wrote:
> Hi,

Hi Volker,

>
> during the KUserFeedback BoF at Akademy there was quite some interest in
> collecting telemetry data in KDE applications. But before actually
> implementing that we agreed to define the rules under which we would want to
> do that. I've tried to put the input we collected during Akademy into proper
> wording below. What do you think? Did I miss anything?
>
> Regards,
> Volker
>
>
> # Telemetry Policy Draft
>
> Application telemetry data can be a valuable tool for tailoring our products
> to the needs of our users. The following rules define how KDE collects and
> uses such application telemetry data. As privacy is of utmost importance to
> us, the general rule of thumb is to err on the side of caution here. Privacy
> always trumps any need for telemetry data, no matter how legitimate.
>
> These rules apply to all products released by KDE.
>
> ## Transparency
>
> We provide detailed information about the data that is going to be shared, in
> a way that:
> - is easy to understand
> - is precise and complete
> - is available locally without network connectivity
>
> Any changes or additions to the telemetry functionality of an application will
> be highlighted in the corresponding release announcement.
>
> ## Control
>
> We give the user full control over what data they want to share with KDE. In
> particular:
> - application telemetry is always opt-in, that is off by default
> - application telemetry settings can be changed at any time, and are provided
> as prominent in the application interface as other application settings
> - applications honor system-wide telemetry settings where they exist (global
> "kill switch")
> - we provide detailed documentation about how to control the application
> telemetry system
>
> In order to ensure control over the data after it has been shared with KDE,
> applications will only transmit this data to KDE servers, that is servers
> under the full control of the KDE sysadmin team.
>
> We will provide a designated contact point for users who have concerns about
> the data they have shared with KDE. While we are willing to delete data a user
> no longer wants to have shared, it should be understood that the below rules
> are designed to make identification of data of a specific user impossible, and
> thus a deletion request impractical.

Can we change "impractical" to "effectively impossible" here please?

>
> ## Anonymity
>
> We do not transmit data that could be used to identify a specific user. In
> particular:
> - we will not use any unique device, installation or user id
> - data is stripped of any unnecessary detail and downsampled appropriately
> before sharing to avoid fingerprinting
> - network addresses (which are exposed inevitably as part of the data
> transmission) are not stored together with the telemetry data, and must only
> be stored or used to the extent necessary for abuse counter-measures

I'm wary that people might jump on the network addresses bit here.

Can we please mention that all records that contain network addresses
and other similar information would be stored in such a form that they
could not be associated with telemetry records?

In terms of the logs: as there are other uses for them, I'd prefer if
we widened that to also allow them to be kept so that we can maintain
the proper and effective operation of the telemetry system and other
associated services. The time we retain those logs should also be at
our complete and total discretion and, if need be, should be indefinite.

>
> ## Minimalism
>
> We only track the bare minimum of data necessary to answer specific questions,
> we do not collect data preemptively or for exploratory research. In
> particular, this means:
> - collected data must have a clear purpose
> - data is downsampled to the maximum extent possible at the source
> - relevant correlations between individual bits of data should be computed at
> the source whenever possible
> - data collection is stopped once corresponding question has been answered
>
> ## Privacy
>
> We will never transmit anything containing user content, or even just hints at
> possible user content such as e.g. file names, URLs, etc.
>
> We will only ever track:
> - system information that are specific to the installation/environment, but
> independent of how the application/machine/installation is actually used
> - statistical usage data of an installation/application
>
> ## Compliance
>
> KDE only releases products capable of acquiring telemetry data if compliance
> with these rules has been established by a public review on [kde-core-devel|
> kde-community]@k

Re: Telemetry Policy

2017-08-14 Thread Mario Fux

On Monday, 14 August 2017, 11:48:04 CEST, Volker Krause wrote:

Morning

[snip]

> > > At the moment, we can not even judge how much data we receive with that
> > > policy, we can still re discuss that if we really see we get not enough
> > > feedback to have usable data.
> > 
> > So general question: How do we communicate or offer to set it on?
> > First-run
> > dialog? Release notes?
> 
> Let's make sure to keep policy and current implementation separate to avoid
> confusion and to avoid tying the policy too much to what happens to be
> implemented right now.

Of course that makes sense.

> The policy doesn't state anything about "encouraging" to opt-in at this
> point. We probably want to add something about not "forcing" opt-in by for
> example not offering certain features only if telemetry is enabled. Beyond
> that it's mainly a marketing thing that doesn't need to be regulated by the
> telemetry policy IMHO.
> 
> The current implementation offers an encouragement system where it displays
> a passive popup after a certain time of using the application. It's
> intentionally not a first run dialog, that would generate a very bad first
> impression. But it's also intentionally a (recurring) in-app notification as
> I doubt we'd get much data if it's purely mentioned in the release notes,
> or the user just happens to find the option somewhere.

Thanks for this information regarding the current implementation.

> The participation ratio is one of the big unknowns so far, so this is an
> important aspect.

Of course, but nonetheless thanks for your work on the KUserFeedback
(soon-to-be ;-) framework.

Mario

Re: Telemetry Policy

2017-08-14 Thread Volker Krause

On Monday, 14 August 2017 11:17:20 CEST Mario Fux wrote:
> On Monday, 14 August 2017, 10:57:10 CEST, Dr.-Ing. Christoph Cullmann wrote:
> > Hi,
> 
> Morning
> 
> > > On 08/14/2017 05:30 PM, Ivan Čukić wrote:
> > >> Hi all,
> > >> 
> > >> While I do see the point behind 'off-by-default', I think it will ruin
> > >> the purpose since nobody will turn it on.
> > >> 
> > >> I'd propose having it on by default (at least) for pre-releases.
> > > 
> > > I'm not convinced. KDE publically runs on a platform of respecting
> > > user privacy (it's part of the vision). On-by-default metrics run
> > > the risk of someone calling bullshit on that, plus it's giving up
> > > on a privacy-related differentiator if, unlike others, we're off-
> > > by-default.
> > 
> > I would vote for "off-by-default", too.
> 
> I don't vote yet ;-).
> 
> > At the moment, we can not even judge how much data we receive with that
> > policy, we can still re discuss that if we really see we get not enough
> > feedback to have usable data.
> 
> So general question: How do we communicate or offer to set it on? First-run
> dialog? Release notes?

Let's make sure to keep policy and current implementation separate to avoid 
confusion and to avoid tying the policy too much to what happens to be 
implemented right now.

The policy doesn't state anything about "encouraging" opt-in at this point. 
We probably want to add something about not "forcing" opt-in, for example by 
offering certain features only if telemetry is enabled. Beyond that it's 
mainly a marketing thing that doesn't need to be regulated by the telemetry 
policy IMHO.

The current implementation offers an encouragement system where it displays a 
passive popup after a certain time of using the application. It's 
intentionally not a first-run dialog, as that would generate a very bad first 
impression. But it's also intentionally a (recurring) in-app notification, as I 
doubt we'd get much data if it's purely mentioned in the release notes, or the 
user just happens to find the option somewhere.

The participation ratio is one of the big unknowns so far, so this is an 
important aspect.

Regards,
Volker

Re: Telemetry Policy

2017-08-14 Thread Ben Cooksley

On Sun, Aug 13, 2017 at 10:56 PM, Martin Flöser  wrote:
> On 2017-08-13 11:47, Volker Krause wrote:
>>
>> Hi,
>>
>> during the KUserFeedback BoF at Akademy there was quite some interest in
>> collecting telemetry data in KDE applications. But before actually
>> implementing that we agreed to define the rules under which we would want
>> to
>> do that. I've tried to put the input we collected during Akademy into
>> proper
>> wording below. What do you think? Did I miss anything?
>
>
> To me it looks good!
>
> I have some additional requests:
>  * the collected data must be made available to the public (mostly thinking
> of research institutes here)

I'm opposed to making the collected data available in anything other
than aggregated form because researchers have previously requested
data such as our Bugzilla database and we've gained nothing from it -
on the contrary it's cost us valuable contributor time (to perform the
necessary packaging to allow them to download the data and to
anonymise it in the case of Bugzilla).

>  * data must be made available under a CC license (CC0?)
>  * maybe allow the user to delete the dataset again (difficult as that
> conflicts with making the data public and would require authentication which
> is the opposite to anonymity).

Allowing anyone to nuke the dataset would defeat the purpose of
collecting it, and short of assigning people some kind of unique
identifier (which defeats the anonymity point) wouldn't allow them to
delete just their own data.

People would need to accept that whatever information is submitted is
not removable.

>
> Cheers
> Martin

Cheers,
Ben

Re: Telemetry Policy

2017-08-14 Thread Mario Fux

On Monday, 14 August 2017, 10:57:10 CEST, Dr.-Ing. Christoph Cullmann wrote:
> Hi,

Morning

> > On 08/14/2017 05:30 PM, Ivan Čukić wrote:
> >> Hi all,
> >> 
> >> While I do see the point behind 'off-by-default', I think it will ruin
> >> the purpose since nobody will turn it on.
> >> 
> >> I'd propose having it on by default (at least) for pre-releases.
> > 
> > I'm not convinced. KDE publically runs on a platform of respecting
> > user privacy (it's part of the vision). On-by-default metrics run
> > the risk of someone calling bullshit on that, plus it's giving up
> > on a privacy-related differentiator if, unlike others, we're off-
> > by-default.
> 
> I would vote for "off-by-default", too.

I don't vote yet ;-).

> At the moment, we can not even judge how much data we receive with that
> policy, we can still re discuss that if we really see we get not enough
> feedback to have usable data.

So general question: How do we communicate or offer to set it on? First-run 
dialog? Release notes?

> Greetings
> Christoph

Greetings
Mario

Re: Telemetry Policy

2017-08-14 Thread Dr.-Ing. Christoph Cullmann

Hi,

> On 08/14/2017 05:30 PM, Ivan Čukić wrote:
>> Hi all,
>> 
>> While I do see the point behind 'off-by-default', I think it will ruin
>> the purpose since nobody will turn it on.
>> 
>> I'd propose having it on by default (at least) for pre-releases.
> 
> I'm not convinced. KDE publically runs on a platform of respecting
> user privacy (it's part of the vision). On-by-default metrics run
> the risk of someone calling bullshit on that, plus it's giving up
> on a privacy-related differentiator if, unlike others, we're off-
> by-default.
I would vote for "off-by-default", too.

At the moment, we cannot even judge how much data we would receive with that
policy; we can still rediscuss it if we really see that we do not get enough
feedback to have usable data.

Greetings
Christoph

-- 
- Dr.-Ing. Christoph Cullmann -
AbsInt Angewandte Informatik GmbH      Email: cullm...@absint.com
Science Park 1                         Tel:   +49-681-38360-22
66123 Saarbrücken                      Fax:   +49-681-38360-20
GERMANY                                WWW:   http://www.AbsInt.com

Management: Dr.-Ing. Christian Ferdinand
Registered in the commercial register of the Saarbrücken district court, HRB 11234

Re: Telemetry Policy

2017-08-14 Thread Eike Hein

On 08/14/2017 05:30 PM, Ivan Čukić wrote:
> Hi all,
> 
> While I do see the point behind 'off-by-default', I think it will ruin
> the purpose since nobody will turn it on.
> 
> I'd propose having it on by default (at least) for pre-releases.

I'm not convinced. KDE publicly runs on a platform of respecting
user privacy (it's part of the vision). On-by-default metrics run
the risk of someone calling bullshit on that, plus it's giving up
on a privacy-related differentiator if, unlike others, we're
off-by-default.

> Cheers,
> Ivan

Cheers,
Eike

Re: Telemetry Policy

2017-08-14 Thread Ivan Čukić

Hi all,

While I do see the point behind 'off-by-default', I think it will ruin
the purpose since nobody will turn it on.

I'd propose having it on by default (at least) for pre-releases.

Cheers,
Ivan

Re: Telemetry Policy

2017-08-13 Thread Ingo Klöcker

On Sunday 13 August 2017 12:56:27 Martin Flöser wrote:
> On 2017-08-13 11:47, Volker Krause wrote:
> > Hi,
> > 
> > during the KUserFeedback BoF at Akademy there was quite some interest
> > in collecting telemetry data in KDE applications. But before actually
> > implementing that we agreed to define the rules under which we would
> > want to do that. I've tried to put the input we collected during
> > Akademy into proper wording below. What do you think? Did I miss
> > anything?
> 
> To me it looks good!
> 
> I have some additional requests:
>   * the collected data must be made available to the public (mostly
> thinking of research institutes here)

I don't think so. And in view of "we do not collect data preemptively or 
for exploratory research" I think the draft agrees with me.

AFAIU, the (raw) collected data is meant to be used by us exclusively to 
answer specific questions. It is specifically not meant to be used by 
random researchers to draw conclusions and publish papers about our 
users. I'm not against publishing aggregates like "25 % of the users of 
KWin use Wayland", but only once a year or so.

Regards,
Ingo

Re: Telemetry Policy

2017-08-13 Thread Thomas Baumgart

Hi Volker et al.

On Sunday, 13 August 2017 11:47:28 CEST Volker Krause wrote:

> Hi,
> 
> during the KUserFeedback BoF at Akademy there was quite some interest in
> collecting telemetry data in KDE applications. But before actually
> implementing that we agreed to define the rules under which we would want to
> do that. I've tried to put the input we collected during Akademy into
> proper wording below. What do you think? Did I miss anything?

Very good. I have followed the discussion about KUserFeedback from the 
beginning, and since I represent the KDE application dealing with personal 
finances, this is a very delicate topic for our application.

The details of the policy are very well laid out and written with the user 
and his control over data and device in mind. This is where some for-profits 
could learn a thing or two from us.

Even though the data should be aggregated and anonymized as much as 
possible on the source device, I think it would make sense to guarantee an 
encrypted transfer of the raw records from the user's computer to the KDE 
infra. What do you think?
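
For illustration, guaranteeing an encrypted transfer mostly comes down to
refusing anything but HTTPS and verifying the server certificate. The
following Python sketch shows the idea. The URL and function names are
hypothetical, and the actual KUserFeedback client is written in C++ and
works differently in detail.

```python
import ssl
import urllib.request

# Hypothetical endpoint, used only for illustration.
SUBMIT_URL = "https://telemetry.example.org/submit"


def make_tls_context():
    """Create a TLS context that verifies the server certificate and host name."""
    context = ssl.create_default_context()  # CERT_REQUIRED plus host name check
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def build_request(payload, url=SUBMIT_URL):
    """Build a POST request; refuse anything that is not HTTPS."""
    if not url.startswith("https://"):
        raise ValueError("telemetry must only be sent over HTTPS")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(payload, url=SUBMIT_URL):
    """Send the payload over a verified TLS connection and return the status."""
    request = build_request(payload, url)
    with urllib.request.urlopen(
        request, context=make_tls_context(), timeout=10
    ) as reply:
        return reply.status
```

This only covers the transfer itself; encrypting the data at rest on the
user's machine would be a separate measure.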

Keep going. This is wonderful.

-- 

Regards

Thomas Baumgart

https://www.telegram.org/   Telegram, the better WhatsApp
-
The only 'intuitive' interface is the nipple. After that, it's all learned.
 -- Bruce Ediger, bedi...@teal.csn.org, on X interfaces
-

Re: Telemetry Policy

2017-08-13 Thread Martin Flöser


On 2017-08-13 11:47, Volker Krause wrote:
> Hi,
>
> during the KUserFeedback BoF at Akademy there was quite some interest in
> collecting telemetry data in KDE applications. But before actually
> implementing that we agreed to define the rules under which we would want
> to do that. I've tried to put the input we collected during Akademy into
> proper wording below. What do you think? Did I miss anything?

To me it looks good!

I have some additional requests:
 * the collected data must be made available to the public (mostly
   thinking of research institutes here)
 * data must be made available under a CC license (CC0?)
 * maybe allow the user to delete the dataset again (difficult as that
   conflicts with making the data public and would require authentication
   which is the opposite of anonymity).

Cheers
Martin

> Regards,
> Volker
>
>
> # Telemetry Policy Draft
>
> Application telemetry data can be a valuable tool for tailoring our products
> to the needs of our users. The following rules define how KDE collects and
> uses such application telemetry data. As privacy is of utmost importance to
> us, the general rule of thumb is to err on the side of caution here. Privacy
> always trumps any need for telemetry data, no matter how legitimate.
>
> These rules apply to all products released by KDE.
>
> ## Transparency
>
> We provide detailed information about the data that is going to be shared, in
> a way that:
> - is easy to understand
> - is precise and complete
> - is available locally without network connectivity
>
> Any changes or additions to the telemetry functionality of an application will
> be highlighted in the corresponding release announcement.
>
> ## Control
>
> We give the user full control over what data they want to share with KDE. In
> particular:
> - application telemetry is always opt-in, that is off by default
> - application telemetry settings can be changed at any time, and are provided
> as prominent in the application interface as other application settings
> - applications honor system-wide telemetry settings where they exist (global
> "kill switch")
> - we provide detailed documentation about how to control the application
> telemetry system
>
> In order to ensure control over the data after it has been shared with KDE,
> applications will only transmit this data to KDE servers, that is servers
> under the full control of the KDE sysadmin team.
>
> We will provide a designated contact point for users who have concerns about
> the data they have shared with KDE. While we are willing to delete data a user
> no longer wants to have shared, it should be understood that the below rules
> are designed to make identification of data of a specific user impossible, and
> thus a deletion request impractical.
>
> ## Anonymity
>
> We do not transmit data that could be used to identify a specific user. In
> particular:
> - we will not use any unique device, installation or user id
> - data is stripped of any unnecessary detail and downsampled appropriately
> before sharing to avoid fingerprinting
> - network addresses (which are exposed inevitably as part of the data
> transmission) are not stored together with the telemetry data, and must only
> be stored or used to the extent necessary for abuse counter-measures
>
> ## Minimalism
>
> We only track the bare minimum of data necessary to answer specific questions,
> we do not collect data preemptively or for exploratory research. In
> particular, this means:
> - collected data must have a clear purpose
> - data is downsampled to the maximum extent possible at the source
> - relevant correlations between individual bits of data should be computed at
> the source whenever possible
> - data collection is stopped once corresponding question has been answered
>
> ## Privacy
>
> We will never transmit anything containing user content, or even just hints at
> possible user content such as e.g. file names, URLs, etc.
>
> We will only ever track:
> - system information that are specific to the installation/environment, but
> independent of how the application/machine/installation is actually used
> - statistical usage data of an installation/application
>
> ## Compliance
>
> KDE only releases products capable of acquiring telemetry data if compliance
> with these rules has been established by a public review on [kde-core-devel|
> kde-community]@kde.org from at least two reviewers. The review has to be
> repeated for every release if changes have been made to how/what data is
> collected.
>
> Received data is regularly reviewed for violations of these rules, in
> particular for data that is prone to fingerprinting. Should such violations be
> found, the affected data will be deleted, and data recording will be suspended
> until compliance with these rules has been established again. In order to
> enable reviewing of the data, every KDE contributor with a developer account
> will have access to all telemetry data gathered by any KDE product.
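
The rule in the Minimalism section that data is downsampled at the source
can be illustrated with a small bucketing helper. This is a sketch only: the
function name, the bucket boundaries and the "usage hours" example are
assumptions, not part of the draft.

```python
def bucket(value, edges):
    """Map an exact integer value to a coarse bucket label.

    `edges` is an ascending list of integer bucket boundaries. Only the
    label is meant to be transmitted; the exact value stays on the client.
    """
    previous = None
    for edge in edges:
        if value < edge:
            if previous is None:
                return f"<{edge}"
            return f"{previous}-{edge - 1}"
        previous = edge
    return f">={edges[-1]}"


# Hypothetical boundaries for "hours of application usage".
USAGE_HOURS_EDGES = [1, 10, 100]

if __name__ == "__main__":
    for hours in (0, 5, 10, 250):
        print(hours, "->", bucket(hours, USAGE_HOURS_EDGES))
```

With only four possible labels, a single value reveals very little about an
individual installation, which is the point of downsampling before sharing.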

Re: Telemetry Policy

2017-08-13 Thread Martin Steigerwald

Hello Volker.

Volker Krause - 13.08.17, 11:47:
> during the KUserFeedback BoF at Akademy there was quite some interest in 
> collecting telemetry data in KDE applications. But before actually 
> implementing that we agreed to define the rules under which we would want
> to do that. I've tried to put the input we collected during Akademy into
> proper wording below. What do you think? Did I miss anything?

I just want to applaud you people at the KUserFeedback BoF.

I think I have never seen an approach that is as cautious about privacy as 
this one. It's really refreshing to see such utmost care. It shows the 
excellence of this community.

Actually… with an approach like this even I might agree to have some data 
collected. :)

I have only one idea left:

How about an option to store transmitted data locally as well, at least for a 
certain amount of time, so the user can actually review what data has been 
sent? And a log of when it was sent. This way a user can actually audit the 
data if she so desires. This would improve the accountability for the promises 
you give in the Telemetry Policy Draft. Maybe the user would never look at the 
data, but knowing that it is there, just in case, can help to build their trust.
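
Such a local audit log could be as simple as one JSON line per submission.
The sketch below is a minimal Python illustration of the idea; the file
format and the function names are assumptions, not an existing
KUserFeedback feature.

```python
import json
import time
from pathlib import Path


def log_submission(log_path, payload, now=None):
    """Append a copy of a transmitted payload, with its send time, to a local log."""
    entry = {"sent": time.time() if now is None else now, "payload": payload}
    with Path(log_path).open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")


def read_log(log_path, max_age_seconds, now=None):
    """Return logged entries newer than `max_age_seconds` and drop older ones."""
    now = time.time() if now is None else now
    path = Path(log_path)
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    entries = [json.loads(line) for line in lines if line]
    recent = [e for e in entries if now - e["sent"] <= max_age_seconds]
    # Rewrite the file so that expired entries do not linger on disk.
    path.write_text(
        "".join(json.dumps(e) + "\n" for e in recent), encoding="utf-8"
    )
    return recent
```

The user, or a small viewer in the application, could then call `read_log`
to see exactly what was sent and when.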

Thanks,
-- 
Martin

Telemetry Policy

2017-08-13 Thread Volker Krause

Hi,

during the KUserFeedback BoF at Akademy there was quite some interest in 
collecting telemetry data in KDE applications. But before actually 
implementing that we agreed to define the rules under which we would want to 
do that. I've tried to put the input we collected during Akademy into proper 
wording below. What do you think? Did I miss anything?

Regards,
Volker


# Telemetry Policy Draft

Application telemetry data can be a valuable tool for tailoring our products 
to the needs of our users. The following rules define how KDE collects and 
uses such application telemetry data. As privacy is of utmost importance to 
us, the general rule of thumb is to err on the side of caution here. Privacy 
always trumps any need for telemetry data, no matter how legitimate.

These rules apply to all products released by KDE.

## Transparency

We provide detailed information about the data that is going to be shared, in 
a way that:
- is easy to understand
- is precise and complete
- is available locally without network connectivity

Any changes or additions to the telemetry functionality of an application will 
be highlighted in the corresponding release announcement.

## Control

We give the user full control over what data they want to share with KDE. In 
particular:
- application telemetry is always opt-in, that is, off by default
- application telemetry settings can be changed at any time, and are provided 
as prominently in the application interface as other application settings
- applications honor system-wide telemetry settings where they exist (global 
"kill switch")
- we provide detailed documentation about how to control the application 
telemetry system
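
The interaction between the opt-in default and a system-wide kill switch could be sketched as follows. This is an illustrative Python model only; the class and field names are assumptions and do not correspond to any existing KDE setting.

```python
from dataclasses import dataclass


@dataclass
class TelemetrySettings:
    # Hypothetical settings model; the names are illustrative only.
    app_opt_in: bool = False          # opt-in: off unless the user enables it
    global_kill_switch: bool = False  # system-wide setting, where one exists


def may_transmit(settings: TelemetrySettings) -> bool:
    """Allow transmission only if the user opted in and no global switch forbids it."""
    if settings.global_kill_switch:
        # The system-wide setting always wins over the per-application one.
        return False
    return settings.app_opt_in
```

With default settings `may_transmit(TelemetrySettings())` is false, so an application that never asks the user never sends anything.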

In order to ensure control over the data after it has been shared with KDE, 
applications will only transmit this data to KDE servers, that is servers 
under the full control of the KDE sysadmin team.

We will provide a designated contact point for users who have concerns about 
the data they have shared with KDE. While we are willing to delete data a user 
no longer wants to have shared, it should be understood that the rules below 
are designed to make identification of data of a specific user impossible, and 
thus a deletion request impractical.

## Anonymity

We do not transmit data that could be used to identify a specific user. In 
particular:
- we will not use any unique device, installation or user id
- data is stripped of any unnecessary detail and downsampled appropriately 
before sharing to avoid fingerprinting
- network addresses (which are exposed inevitably as part of the data 
transmission) are not stored together with the telemetry data, and must only 
be stored or used to the extent necessary for abuse counter-measures
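
As an illustration of downsampling before sharing, the following Python sketch maps exact values to coarse buckets and builds a report that carries no identifier at all. The bucket boundaries, field names and the choice of fields are assumptions made for this example, not part of the policy.

```python
def downsample_ram_mb(ram_mb: int) -> str:
    """Map an exact RAM size to a coarse bucket label."""
    # Hypothetical bucket boundaries, chosen only for illustration.
    for limit_gb in (2, 4, 8, 16, 32):
        if ram_mb <= limit_gb * 1024:
            return f"<={limit_gb}GB"
    return ">32GB"


def prepare_report(raw: dict) -> dict:
    """Build the record to share: only whitelisted fields, coarse values, no ids."""
    return {
        "ram": downsample_ram_mb(raw["ram_mb"]),
        # Keep only major.minor of the version to reduce detail.
        "app_version": ".".join(raw["app_version"].split(".")[:2]),
    }
```

Because the report is built from an explicit whitelist, anything else present in the raw input (for example a machine id) is never copied into the shared record.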

## Minimalism

We only track the bare minimum of data necessary to answer specific questions; 
we do not collect data preemptively or for exploratory research. In 
particular, this means:
- collected data must have a clear purpose
- data is downsampled to the maximum extent possible at the source
- relevant correlations between individual bits of data should be computed at 
the source whenever possible
- data collection is stopped once the corresponding question has been answered
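
A small Python sketch of computing a correlation at the source and stopping collection once the question is answered. The question, the feature and all names here are hypothetical; the point is only that one combined category leaves the machine instead of the individual raw values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Question:
    # Hypothetical description of one specific question being investigated.
    text: str
    answered: bool = False


def collect(question: Question, screen_count: int, feature_used: bool) -> Optional[str]:
    """Reduce two local observations to the single category the question needs."""
    if question.answered:
        # Nothing is collected once the question has been answered.
        return None
    screens = "multi-screen" if screen_count > 1 else "single-screen"
    usage = "used" if feature_used else "unused"
    # Only the combined category is shared, not the exact screen count.
    return f"{screens}/{usage}"
```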

## Privacy

We will never transmit anything containing user content, or even just hints at 
possible user content, such as file names, URLs, etc.

We will only ever track:
- system information that is specific to the installation/environment, but 
independent of how the application/machine/installation is actually used
- statistical usage data of an installation/application

## Compliance

KDE only releases products capable of acquiring telemetry data if compliance 
with these rules has been established by a public review on [kde-core-devel|
kde-community]@kde.org from at least two reviewers. The review has to be 
repeated for every release if changes have been made to how/what data is 
collected.

Received data is regularly reviewed for violations of these rules, in 
particular for data that is prone to fingerprinting. Should such violations be 
found, the affected data will be deleted, and data recording will be suspended 
until compliance with these rules has been established again. In order to 
enable reviewing of the data, every KDE contributor with a developer account 
will have access to all telemetry data gathered by any KDE product.

