Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-23 Thread Michael Reichert
Hi Roland,

On 2018-06-20 at 20:16, Roland Olbricht wrote:
> On the technical side, things are even worse. The elephant in the room
> is OAuth. OAuth is built on, in particular, the assumptions that
> - the consumer ("the website") is stateful
> - sessions are relatively long-lived, i.e. some seconds to some hours
> - the identity provider has the cross-origin assets
> None of the three is true for Overpass API, which means that I have to
> work around OAuth or significantly mess with it.
> 
> For example, implementing sessions on Overpass API would require
> developing a full-fledged security system to deal with the hundreds of
> potential modes of attack on session-based systems. Even if that works,
> the median runtime for a request on Overpass API is well below a second,
> and just the round-trip times for the three-legged OAuth communication
> sum up to more. We have not even started to talk about the plethora of
> error messages that need to be formulated, explained, and implemented.

You do not need the full roundtrip for each request. I have implemented
the authentication of the protected part of Geofabrik's download service
(https://osm-internal.download.geofabrik.de/). Its source code can be
found at https://github.com/geofabrik/sendfile_osm_oauth_protector

1. If a user requests a protected resource https://HOST/PATH for the
first time, he will receive the landing page containing a link to
https://HOST/PATH?show_landing_page.

2. If he follows this link, the web application will check if he
attached a cookie to his request. If no cookie was attached, the
application will retrieve a temporary request token from
https://www.openstreetmap.org/oauth/request_token and reply with a
redirect (302 Found) to
https://www.openstreetmap.org/authorize?oauth_token=X&oauth_token_secret=Y&oauth_callback=https://HOST/PATH?oauth_token_secret_encr=Y_ENCRYPTED

3. The browser will call the URL in the Location header of the response
from item 2. If the user is already logged in to OSM, he will be asked
to grant a permission to the application "Geofabrik Downloads".
Otherwise, he has to log in first.

4. If the user grants the permission and clicks on "Grant permissions",
his browser will send an HTTP POST request to
https://www.openstreetmap.org/oauth/authorize. The OSM website will
respond with code 302, pointing him to
https://HOST/PATH?oauth_token_secret_encr=Y_ENCRYPTED&oauth_token=X

5. The user calls (HTTP GET)
https://HOST/PATH?oauth_token_secret_encr=Y_ENCRYPTED&oauth_token=X. The
web application of the download server recognizes the URL parameters
oauth_token and oauth_token_secret. The web application retrieves a
permanent OAuth access token from the OSM API by calling
https://www.openstreetmap.org/access_token. If that works, it is able to
call https://api.openstreetmap.org/api/0.6/user/details (with the
permanent access token in the HTTP Authorization header). If this
request does not fail, the access token is valid and the web application
has ensured that the client has a valid OSM account. The web application
sets a cookie as described in
https://github.com/geofabrik/sendfile_osm_oauth_protector/blob/master/doc/cookie.md
and responds with the requested resource.

The cookie contains the login status (unencrypted, unsigned), the name
of the key set which was used by the server to encrypt and sign the
cookie, and an encrypted and signed part consisting of the access token,
the access token secret and the expiry date (48 hours).

6. The client sends this cookie with all future requests to the server.
The server decrypts the cookie and checks the signature. If it is ok and
the expiry date has not passed yet, the request is answered immediately
without further OAuth round trips. If the expiry date has passed, the
server responds with a redirect (code 302) as described in item 2.
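
The six steps above amount to a small decision procedure that runs on
every request. The following is only a sketch of that procedure, not the
code of sendfile_osm_oauth_protector: the function name `next_action` and
the returned action strings are made up for illustration, and the
parameter names are taken from the URLs quoted above.

```python
from typing import Mapping, Optional


def next_action(query: Mapping[str, str],
                cookie_expiry: Optional[float],
                now: float) -> str:
    """Decide how to answer one request for a protected resource.

    cookie_expiry is the expiry timestamp taken from a cookie whose
    signature has already been verified, or None if the request carried
    no valid cookie.
    """
    # Step 6: a valid, unexpired cookie needs no OAuth round trip at all.
    if cookie_expiry is not None and now < cookie_expiry:
        return "serve_resource"
    # Step 5: the user returns from the OSM website with token parameters.
    if "oauth_token" in query and "oauth_token_secret_encr" in query:
        return "exchange_token_and_set_cookie"
    # Step 2: the user followed the link on the landing page.
    if "show_landing_page" in query:
        return "redirect_to_osm_authorize"
    # Step 6, second half: an expired cookie restarts the flow at step 2.
    if cookie_expiry is not None:
        return "redirect_to_osm_authorize"
    # Step 1: first contact, no cookie and no parameters.
    return "show_landing_page"
```

Only the branches "exchange_token_and_set_cookie" and
"redirect_to_osm_authorize" involve the OSM servers; every request with
a valid cookie is answered locally.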

Our solution does not need any session management on our side. The
session IDs are stored in the cookies. They are encrypted and signed to
prevent clients from manipulating them (or the expiry date).
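
To illustrate why no server-side session store is needed, here is a
minimal sketch of a cookie that carries its own expiry date and is
protected by a signature. It is an assumption-laden simplification: it
only signs (HMAC-SHA256 from the Python standard library) and does not
encrypt, whereas the cookie described above is encrypted as well.
`SECRET_KEY`, `make_cookie` and `read_cookie` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET_KEY = b"example-key-change-me"  # hypothetical server-side signing key
LIFETIME_SECONDS = 48 * 3600           # 48 hours, as described above


def make_cookie(payload: dict, now: Optional[float] = None) -> str:
    """Serialize payload plus expiry date and append an HMAC signature."""
    now = time.time() if now is None else now
    body = dict(payload, expires=now + LIFETIME_SECONDS)
    raw = base64.urlsafe_b64encode(json.dumps(body, sort_keys=True).encode())
    sig = hmac.new(SECRET_KEY, raw, hashlib.sha256).hexdigest()
    return raw.decode() + "." + sig


def read_cookie(cookie: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the payload if signature and expiry are ok, else None."""
    now = time.time() if now is None else now
    raw, sep, sig = cookie.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(SECRET_KEY, raw.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison, so the check does not leak timing hints.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    body = json.loads(base64.urlsafe_b64decode(raw.encode()))
    if now >= body["expires"]:
        return None
    return body
```

Because the expiry date is inside the signed part, a client cannot
extend its own session without invalidating the signature.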

It would be possible to avoid the round trip every 48 hours if the
server called https://api.openstreetmap.org/api/0.6/user/details again
using the permanent access token (it's in the cookie). This means you
could re-check the validity of the OSM account every 48 hours.
However, this feature bears a security risk. It is implemented in my
tool but we at Geofabrik decided not to use it. If a user accidentally
publishes his cookie on GitHub (e.g. by forgetting to remove it from a
curl invocation), someone else could use it forever (until the access
token is revoked by the user, which usually does not happen). Instead, we
require the user to re-enter his OSM account credentials every 48 hours,
so lasting misuse would require the user to publish his OSM account
credentials themselves.

> On top of that, the OAuth idea means that each and every sequence of
> user data access will trigger an event on the central OSM OAuth server.
> This is quite Orwellian. Even if you do not store that 

Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-21 Thread Frederik Ramm
Roland,

the changes that I proposed mean that your Overpass API will, if it
wants to continue downloading user data from OSM, at some point in the
future have to identify itself to OSM with an OSM account as proof of
your acceptance of the Terms of use.

This is the *technical* requirement for having access to OSM user data
in the future, and it is easy to do. I'm happy to provide the necessary
script for that when the time comes.

Overpass already differentiates between output with and output without
meta data. The output without meta data, which IMHO is totally
sufficient for the overwhelming amount of Overpass use cases, would
continue unchanged.
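
For readers who do not use Overpass regularly: the difference is in the
output statement of the query, `out;` versus `out meta;`, where the
latter adds version, changeset, timestamp and user information to each
element. A small sketch that builds both variants; the helper name
`build_query` is only an illustration, not part of any Overpass client.

```python
from typing import Tuple


def build_query(bbox: Tuple[float, float, float, float],
                with_meta: bool = False) -> str:
    """Build an Overpass QL query for all nodes in a bounding box.

    bbox is (south, west, north, east). With with_meta=True the query
    asks for metadata (including user data) as well.
    """
    south, west, north, east = bbox
    out_statement = "out meta;" if with_meta else "out;"
    return f"[out:json];node({south},{west},{north},{east});{out_statement}"
```

Under the proposal, only queries of the second kind would be affected.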

So these use cases are all covered without any of us investing any work,
without a "development backlog of more than a year" or killing the
project entirely.

Let's look at those use cases where Overpass users would like to
download user data.

You seem to assume that this not only requires the overpass user to have
an OSM account but also that the overpass user somehow goes through an
OAuth process with OSM every time they want to access Overpass.

This is *not* intended to be a requirement.

The ToU will require - in wording that is yet to be defined - that you
take care to only distribute OSM user data for purposes that the OSMF
considers legitimate. Now it is clear that you cannot actually *control*
what users do with data - but you will be expected to inform them that
they have to conform to the OSMF's rules when they process this data.

One *possible* way of doing that would be to simply have them prove that
they have an OSM account, because if they have an account, then they
also have accepted the ToU, and then you don't have to explain anything
to them. This *could* be done with OAuth, either with every request they
send, or you could have your own database of Overpass API keys where
people have to prove they have an OSM account when they register.
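
A rough sketch of the second variant, a local database of API keys, to
show that it does not need to store anything about the person. Everything
here is hypothetical (class and method names, the in-memory set standing
in for a database); how the OSM account is proven at registration time is
left outside the sketch.

```python
import hashlib
import secrets
from typing import Set


class ApiKeyRegistry:
    """Issues API keys and remembers only a hash of each key."""

    def __init__(self) -> None:
        self._key_hashes: Set[str] = set()

    def register(self, has_proven_osm_account: bool) -> str:
        """Issue a new key once the applicant has proven an OSM account."""
        if not has_proven_osm_account:
            raise PermissionError("proof of an OSM account is required")
        key = secrets.token_urlsafe(32)
        # Neither the user name nor the key itself is stored.
        self._key_hashes.add(hashlib.sha256(key.encode()).hexdigest())
        return key

    def is_valid(self, key: str) -> bool:
        """Check a key presented with a later request."""
        return hashlib.sha256(key.encode()).hexdigest() in self._key_hashes
```

With such a registry, OAuth would be involved once per registration
rather than once per request.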

But you could also run a scheme completely independent of OSM, where
anyone can register for an "Overpass account" and you show them some
text that says "By signing up for an Overpass account you promise to
always stick to OSM's terms of use" or so.

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"

___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Ian Dees
On Wed, Jun 20, 2018 at 2:33 PM Christoph Hormann wrote:

> I assume whether this is actually the case will depend on the specifics
> of the OSMF ToU.  I would also assume that (b) most likely would not
> require you to use OAuth with every request, you probably could just
> use OAuth when people register with you for an API key.

I don't mean to speak for Roland, but Overpass doesn't require any sort of
OAuth or API key access restriction. Adding such a thing (as the proposed
changes seem to require for Overpass to continue) is a significant change
(including storing data about users that is not currently required) and
Roland might not want to undertake such a project.


Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Christoph Hormann
On Wednesday 20 June 2018, Roland Olbricht wrote:
> [...]
> Taking GDPR seriously means every data processor must decide which use
> cases they make simple, which use cases they make hard, and tailor
> the documentation according to that. For example, for that reason
> Overpass API has no feature to track all actions of a single user. I
> have proposed a declaration tailored to Overpass API on the FOSSGIS
> list (the FOSSGIS is sponsoring the server operations), and I would
> prefer to go forward with that one. A central ToU does not help,
> hence having it ticked or not is of no interest to the data
> processor.

Since not everyone knows the draft you suggested in FOSSGIS - the plans 
you sketch there (correct me please if i am wrong) essentially say that 
you intend to continue distributing geodata and timestamps without 
access restrictions but plan to manage restricted access to other data 
(changesets and user identities) using your own mechanism and own 
criteria of approval (which are not completely finalized yet).

As i understand your mail here you think this clashes with the OSMF 
plans because these will require you - for accessing the raw data to 
feed into the Overpass API - to accept the OSMF ToU which likely will 

a) not allow you to distribute data with timestamps without access 
restrictions
b) require you to implement access restrictions using OAuth

I assume whether this is actually the case will depend on the specifics
of the OSMF ToU.  I would also assume that (b) most likely would not
require you to use OAuth with every request, you probably could just
use OAuth when people register with you for an API key.

-- 
Christoph Hormann
http://www.imagico.de/



Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Bryan Housel
> On the technical side, things are even worse. The elephant in the room is 
> OAuth. OAuth is built on, in particular, the assumptions that
> - the consumer ("the website") is stateful
> - sessions are relatively long-lived, i.e. some seconds to some hours
> - the identity provider has the cross-origin assets
> None of the three is true for Overpass API, which means that I have to
> work around OAuth or significantly mess with it.

Just wanted to respond to the technical part of this - my impression was that 
embedding a policy change into an OAuth flow wouldn’t be too intrusive.

I was assuming that on the server side they would just revoke everyone’s
OAuth tokens for certain apps (essentially forcing everyone to be logged out).

When using the OAuth app, at some point the user would need to log in.  They'd 
be presented with the same screen requesting account permissions, but then 
might be redirected through an extra screen that explains the privacy policy 
and asks the user to read and check a box before continuing.  This screen could 
appear only if their account hasn’t already accepted the policy.  Finally OAuth 
would call back to your app with the secrets like it normally would.

I could be misunderstanding - Hopefully someone will correct me if I’m wrong :)

Thanks, Bryan




Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Simon Poole
Just as a clarification:

- we do intend to have ToS for both the website and the API, that among
other things address privacy aspects (a 1st draft went out for comment
to the OSMF board and the WGs today, and if no major blockers are found
will be available for public comment rsn).

- I expect that access to the "raw" data including user data will only
be possible with a login and correspondingly agreeing to the ToU

- use of OSM data within the limits of the ToU will likely not cause any
privacy related issues, so the barrier of having an OSM account and
agreeing to the ToU would seem to be enough for the typical use of
contributors.

- use of the data as in, say, osmcha, Pascal's services and so on, which
would not be covered by the ToU, does raise privacy issues and such
consumers/distributors of OSM data are expected to act as independent
controllers as already outlined. Essentially what we can offer as
support there is the already mentioned list of deleted accounts, and
providing all contributors with a list of data controllers to fulfil the
obligations in GDPR Art. 14.

As said some details still need to be worked out, so I wouldn't take it
as gospel right now.

Simon



Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Roland Olbricht

Hi,

brief and frank: The suggested way that users of Overpass API have to 
sign up as OSM users would cause a downtime of some months and a 
development backlog of more than a year, or kill the project entirely.

Because this sounds harsh, I will explain that further down.

The key point is: please do not bind information intended for data 
processors to OSM user accounts.



> The only alternatives I can see would be:
>
> 1. Stop distributing who-did-what-when information
> [...] it would create a privileged class inside OSM
> [...] 2. Take the view that distributing the data is what we do and tough
> luck, you've signed up to it.

As Simon has pointed out, there is another alternative, and so far I have
understood that the OSMF wanted to follow that way:

> as has been outlined before, 3rd parties using OSM data with user data
> will be acting as independent data controllers and will not be
> processing data on behalf of the OSMF (which would require a DPA and all
> the associated complications). They will have to make their own
> determinations on how to deal with the situation. We will provide some
> support to such entities to help them fulfil their legal obligations
> (for example a list of deleted users), but that's it.


In particular, data processors do a much better job if they do not deal 
with OSM accounts at all, avoiding having and triggering extra 
who-viewed-what data.


Most important, privacy relevance varies heavily with context. Hence I 
will and should inform users about different risks than the OSMF, and 
HDYC may again decide to stress other aspects. A central ToU cannot do 
that. Also, for that reason the GDPR is a law and not a suggestion for a 
contract, and the OSMF decided to handle it as such.


To give an analogy, think of blades. It is forbidden by law to injure or 
kill someone, and blades of any kind do pose a risk. Kitchen knives can 
be used to stab someone, but nobody ever got stabbed with a kitchen 
blender. By contrast, users may harm themselves when using a kitchen 
blender. For that reason, you should be informed about the blades in the 
kitchen blender's manual, but no knife salesman in the world would 
require you to sign a contract not to stab someone else with the knife. 
Conversely, giving too detailed information about which approaches to
stabbing are physiologically risky and which are harmless could be abused
as how-to-stab instructions.


Taking GDPR seriously means every data processor must decide which use 
cases they make simple, which use cases they make hard, and tailor the 
documentation according to that. For example, for that reason Overpass 
API has no feature to track all actions of a single user. I have 
proposed a declaration tailored to Overpass API on the FOSSGIS list (the 
FOSSGIS is sponsoring the server operations), and I would prefer to go 
forward with that one. A central ToU does not help, hence having it 
ticked or not is of no interest to the data processor.


Then there is the problem that, regardless of whether you expect that OSM 
users will read the ToU or just tick the box, you have downsides:
- If you expect that users do read the ToU then we will scare away users 
that just signed up to fix a POI and find themselves scrolling through 
pages of legalese on a mobile phone
- If you do not expect that users read the ToU then the bad guys in 
particular won't, and no judge would ever count that as an 
appropriate technical protection of data


In addition, this is stealing users' attention from more important 
matters. Our contributor terms have quite some content, and that is so
for a reason.


On the technical side, things are even worse. The elephant in the room 
is OAuth. OAuth is built on, in particular, the assumptions that

- the consumer ("the website") is stateful
- sessions are relatively long-lived, i.e. some seconds to some hours
- the identity provider has the cross-origin assets
None of the three is true for Overpass API, which means that I have to
work around OAuth or significantly mess with it.


For example, implementing sessions on Overpass API would require 
developing a full-fledged security system to deal with the hundreds of 
potential modes of attack on session-based systems. Even if that works, 
the median runtime for a request on Overpass API is well below a second, 
and just the round-trip times for the three-legged OAuth communication
sum up to more. We have not even started to talk about the plethora of
error messages that need to be formulated, explained, and implemented.


On top of that, the OAuth idea means that each and every sequence of 
user data access will trigger an event on the central OSM OAuth server. 
This is quite Orwellian. Even if you do not store that information, your 
friendly agency of choice will do so on the line that connects the server.


Additionally, if you monitor "independent processors" so closely, it is 
questionable whether a judge would not see them as disguised contractors.


I can live with the 

Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Simon Poole


On 20.06.2018 at 16:50, Jochen Topf wrote:
> On Wed, Jun 20, 2018 at 03:08:52PM +0200, Frederik Ramm wrote:
>>> instead of arguing that this data needs to be public for everyone.
>> Any judge will laugh at you if you say that the information that user
>> John Smith has mapped something at 4:23 on the 3rd of January needs to
>> be public for everyone. Why would it, outside of a very narrow number of
>> QA related use cases?
> If you publish a book you can't later force everybody who bought the book
> to remove the author either. 
Wrong analogy.

Nobody is asking pure consumers of OSM data to hunt through their
data and remove deleted users' IDs and display names. But just as a
book could be removed from circulation by the author/publisher etc., and
they could very well ask downstream book stores and distributors to stop
selling and distributing the book, we and downstream distributors of the
data need to stop distributing and publishing user data when the user
wants it deleted. To reuse your analogy, we are removing the author's name
from new books sold and allowing it to otherwise remain in circulation.

Simon


> The law shouldn't be applicable to what we
> are doing here and we should argue that it isn't. If we bend over
> backwards to try to comply with something that doesn't fit and can't fit,
> we have already lost the political case and the legal case, and broken
> the openness this project needs.
>
> Jochen






Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Mateusz Konieczny
20. Jun 2018 16:50 by joc...@remote.org :


> The law shouldn't be applicable to what we
> are doing here

If you think that the law is wrong and should be modified, I suggest
lobbying elsewhere than on a mailing list discussing technical issues
related to OSM.

> we have already lost the political case and the legal case, and broken
> the openness this project needs.

Personally, I see no breach of openness in slightly restricting how
detailed history info is distributed and recording that changeset data
is not supposed to be used for stalking mappers.



Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Jochen Topf
On Wed, Jun 20, 2018 at 03:08:52PM +0200, Frederik Ramm wrote:
> > instead of arguing that this data needs to be public for everyone.
> 
> Any judge will laugh at you if you say that the information that user
> John Smith has mapped something at 4:23 on the 3rd of January needs to
> be public for everyone. Why would it, outside of a very narrow number of
> QA related use cases?

If you publish a book you can't later force everybody who bought the book
to remove the author either. The law shouldn't be applicable to what we
are doing here and we should argue that it isn't. If we bend over
backwards to try to comply with something that doesn't fit and can't fit,
we have already lost the political case and the legal case, and broken
the openness this project needs.

Jochen
-- 
Jochen Topf  joc...@remote.org  https://www.jochentopf.com/  +49-351-31778688



Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Frederik Ramm
Hi,

On 06/20/18 11:38, Jochen Topf wrote:
> And if you actually want to make sure that redacted data (because the
> user wanted it to be deleted) is deleted downstream also, 

We will not try to "make sure" that this happens, but we plan to offer
help for downstream data processors, likely by publishing some sort of
feed or list of user deletions and redactions. This hasn't been specced
out yet, and doesn't have to be at this point in time.
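
Since the format of that feed is not specified yet, the following is
purely a sketch of how a downstream processor might apply it, under the
assumption that the feed boils down to a set of numeric user IDs. The
function name is hypothetical; the field names `user` and `uid` are the
ones used in OSM element metadata.

```python
from typing import Dict, Iterable, List, Set

USER_FIELDS = ("user", "uid")


def redact_deleted_users(elements: Iterable[Dict],
                         deleted_uids: Set[int]) -> List[Dict]:
    """Return elements with user data removed for deleted accounts.

    Elements of other users are passed through unchanged; the input
    dictionaries are never modified in place.
    """
    redacted = []
    for element in elements:
        if element.get("uid") in deleted_uids:
            element = {key: value for key, value in element.items()
                       if key not in USER_FIELDS}
        redacted.append(element)
    return redacted
```

The geodata itself stays untouched; only the link to the deleted
account is dropped.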

We sure as hell don't intend to track and record who accesses OSM user data.

> It might be "the least disruptive", but if it doesn't make any sense,
> that doesn't make it better. Any judge will laugh at you if you tell
> them: Well, we trust the million users we already have and the other 6
> billion who can sign on to OSM anonymously more than we trust the
> general public.

I think that setting out clear terms for the users we already have and
those who might sign up in the future *is* a step in the right
direction. It conveys the message that personal data isn't handed out
willy-nilly, and that you have a certain responsibility when dealing
with it.

> It is a step towards making the project more closed and burying it in
> bureaucracy. 

I don't see that.

> You are ceding ground

This isn't a war between us and the EU in which we "cede ground". I
don't even think that being able to bandy around personal information
about OSM users is a goal worth fighting for. If I had to decide whether
the fact that everyone can stalk our mappers using OSM data is a
necessary side effect of our work, or counter to our interests, I would
probably lean towards the latter.

> instead of arguing that this data needs to be public for everyone.

Any judge will laugh at you if you say that the information that user
John Smith has mapped something at 4:23 on the 3rd of January needs to
be public for everyone. Why would it, outside of a very narrow number of
QA related use cases?

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"



Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Michał Brzozowski
If OSM metadata is believed by OSMF to be personal data, so should be
photos added to Wikimedia Commons with a geotag. If anything, it's a
stronger proof that the user was there. I wonder what their legal team
thinks of it.

On Wed, 20 Jun 2018 at 11:41, Jochen Topf wrote:

> On Wed, Jun 20, 2018 at 09:03:01AM +0200, Frederik Ramm wrote:
> > > All of
> > > this needs to be tied in the OAuth stuff and it has to be done in a way
> > > that 3rd party services using OSM data can ask *their* downstream users
> > > to identify in the same way which allows OSM to track everybody who
> uses
> > > the full OSM data everywhere adding more personal data to keep and to
> > > explain to users and get permissions from users for.
> >
> > No, there's a mistake in your reasoning here.
> >
> > It is true that downstream data distributors like Overpass or the
> > Geofabrik downloads need to be able to verify whether someone has an OSM
> > account or not. Pascal has been doing that for ages on his HDYC site,
> > for example.
> >
> > But downstream data distributors do not need to know or store anything
> > more than that; the Geofabrik download server for example will not even
> > store the user name of the person who has logged in, just that "whoever
> > is here has just proven they have an OSM account". So the downstream
> > distributor can deal with this without processing any personal data. (It
> > would be possible to extend our OAuth system by a call that would not
> > even return the user's identity to the caller - currently the identity
> > is returned to the caller and the caller must then decide whether to
> > process it or not.)
>
> It doesn't matter if you store the user name or not. If you ask somebody
> to enter personal information, you have to tell them what this is
> for. The user doesn't understand how OAuth works or how it is
> configured, so for them both the downstream site and OSMF get the
> personal information, so you have to explain to the user what's
> happening, even if you don't store the data for more than the few
> milliseconds it needs to authenticate them. And the downstream site has
> to make the user aware of any restrictions, too.
>
> And chances are all of this will end up in some logfiles unless
> everybody makes sure it doesn't.
>
> And if you actually want to make sure that redacted data (because the
> user wanted it to be deleted) is deleted downstream also, you have to
> know who you gave this data to and inform them or find some other way
> of informing them.
>
> > > Please stop this nonsense now!
> >
> > Given these alternatives, I think the course currently followed by the
> > OSMF is the least disruptive.
>
> It might be "the least disruptive", but if it doesn't make any sense,
> that doesn't make it better. Any judge will laugh at you if you tell
> them: Well, we trust the million users we already have and the other 6
> billion who can sign on to OSM anonymously more than we trust the
> general public.
>
> I don't know what the right way of handling this is, but I do know that
> this isn't the right way. It isn't even a step in the right direction.
> It is a step towards making the project more closed and burying it in
> bureaucracy. You are ceding ground leading into a morass of legal details
> instead of arguing that this data needs to be public for everyone.
>
> Jochen
> --
> Jochen Topf  joc...@remote.org  https://www.jochentopf.com/  +49-351-31778688


Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Martin Koppenhoefer
2018-06-20 9:26 GMT+02:00 Simon Poole :

>  There are still some open questions on
> exactly what needs to be done, in particular wrt transfers of data to
> countries where the EU hasn't made an equivalence determination, but we
> are slowly firming that up.
>



For reference, the countries that have been determined equivalent are these
(plus the EEA countries):

The European Commission has so far recognised Andorra, Argentina, Canada
(commercial organisations), Faroe Islands, Guernsey, Israel, Isle of Man,
Jersey, New Zealand, Switzerland, Uruguay and the US (limited to the Privacy
Shield framework) as providing adequate protection.

Adequacy talks are ongoing with Japan and South Korea.
Cheers,
Martin


Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Jochen Topf
On Wed, Jun 20, 2018 at 09:03:01AM +0200, Frederik Ramm wrote:
> > All of
> > this needs to be tied in the OAuth stuff and it has to be done in a way
> > that 3rd party services using OSM data can ask *their* downstream users
> > to identify in the same way which allows OSM to track everybody who uses
> > the full OSM data everywhere adding more personal data to keep and to
> > explain to users and get permissions from users for.
> 
> No, there's a mistake in your reasoning here.
> 
> It is true that downstream data distributors like Overpass or the
> Geofabrik downloads need to be able to verify whether someone has an OSM
> account or not. Pascal has been doing that for ages on his HDYC site,
> for example.
> 
> But downstream data distributors do not need to know or store anything
> more than that; the Geofabrik download server for example will not even
> store the user name of the person who has logged in, just that "whoever
> is here has just proven they have an OSM account". So the downstream
> distributor can deal with this without processing any personal data. (It
> would be possible to extend our OAuth system by a call that would not
> even return the user's identity to the caller - currently the identity
> is returned to the caller and the caller must then decide whether to
> process it or not.)

It doesn't matter if you store the user name or not. If you ask somebody
to enter personal information, you have to tell them what this is
for. The user doesn't understand how OAuth works or how it is
configured, so for them both the downstream site and OSMF get the
personal information, so you have to explain to the user what's
happening, even if you don't store the data for more than the few
milliseconds it needs to authenticate them. And the downstream site has
to make the user aware of any restrictions, too.

And chances are all of this will end up in some logfiles unless
everybody makes sure it doesn't.

And if you actually want to make sure that redacted data (because the
user wanted it to be deleted) is deleted downstream also, you have to
know who you gave this data to and inform them or find some other way
of informing them.

> > Please stop this nonsense now!
>
> Given these alternatives, I think the course currently followed by the
> OSMF is the least disruptive.

It might be "the least disruptive", but if it doesn't make any sense,
that doesn't make it better. Any judge will laugh at you if you tell
them: Well, we trust the million users we already have and the other 6
billion who can sign on to OSM anonymously more than we trust the
general public.

I don't know what the right way of handling this is, but I do know that
this isn't the right way. It isn't even a step in the right direction.
It is a step towards making the project more closed and burying it in
bureaucracy. You are ceding ground leading into a morass of legal details
instead of arguing that this data needs to be public for everyone.

Jochen
-- 
Jochen Topf  joc...@remote.org  https://www.jochentopf.com/  +49-351-31778688



Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Christoph Hormann
On Wednesday 20 June 2018, Frederik Ramm wrote:
>
> In my view, this is not "cargo cult". If someone comes to us, today,
> and complains that their OSM contributions are being used to stalk
> them, then we cannot even point to a rule that says you cannot do
> this. The stalker is, as far as OSMF is concerned, 100% within their
> rightful use of the data. This is something that needs to stop - even
> if, in the future, it only becomes marginally more difficult for the
> stalker to use OSM data, at least we clearly say that (a) this use is
> not allowed, and (b) the stalker knows it.

I am no legal expert on this matter, but as I understand it you do not
need a contractual agreement to restrict what others can do with data 
you give to them.  You simply put this in the terms of use and the data 
user can either use the data under these terms or not at all.  If 
people use or distribute the data for other purposes they are in 
violation of copyright.

As far as I understand it as a layman, a copyright violation is much
easier to prove in a legal dispute than a contract violation (which
would require you to prove the existence of a valid contract, and that
is likely to be practically impossible in almost every case here).

> Yes. This also requires the delicate distinction that not everything
> in a .osm file is necessarily under ODbL.

Note you can - based on the current contributor terms - only do this if 
you declare the sensitive metadata to not be data contributed by the 
users because for contributed data the OSMF only has the mandate to use 
or sub-license under an open license.  But if you declare the metadata 
not ODbL you run into the problem that if you combine it with ODbL 
geodata you have to consider the share-alike requirements of the ODbL.  
And at least for history data where correct interpretation of the 
geodata depends on the timestamps you will have a hard time 
interpreting this as a collective database.

-- 
Christoph Hormann
http://www.imagico.de/



Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Simon Poole


Am 20.06.2018 um 07:58 schrieb Jochen Topf:
> [ a lot of stuff that is (technically) reasonably easy deleted ]
>
> On Tue, Jun 19, 2018 at 10:54:07PM +0200, Frederik Ramm wrote:
>> 3a. issue guidelines about what you are allowed to do with the user data
>> files,
>> 3b. ensure that everyone who has an OSM account agrees to these
>> guidelines one way or the other,
> This is the part that's not easy and where there is a lot of important
> detail missing. You have to get everybody to agree, which is not going
> to happen. So you have to add some flag to the database telling the
> system whether you are allowed to download or not. You probably have to
> change rules in the future so you have to make this generic, keeping
> information about who clicked through which version of the rules. So you
> are generating more information you are tracking with each user, more
> personal information for which you need consent from the user. 
A) we are not asking for consent, B) yes, we will need an extra flag for
ToU acceptance.

But in any case up to here this is a fairly accurate description of what
the intent is.

> All of
> this needs to be tied in the OAuth stuff and it has to be done in a way
> that 3rd party services using OSM data can ask *their* downstream users
> to identify in the same way which allows OSM to track everybody who uses
> the full OSM data everywhere adding more personal data to keep and to
> explain to users and get permissions from users for.
Nope:
-  anybody using OSM data without the user data is not going to be
affected at all and they don't need to change anything (I've seen
indications that this could be more than 99% of all users downloading
OSM data)
- as has been outlined before, 3rd parties using OSM data with user data
will be acting as independent data controllers and will not be
processing data on behalf of the OSMF (which would require a DPA and all
the associated complications). They will have to make their own
determinations on how to deal with the situation. We will  provide some
support to such entities to help them fulfil their legal obligations
(for example a list of deleted users), but that's it. Naturally the GDPR
applies to such entities completely regardless of what we say, since the
GDPR just happens to be the law. There are still some open questions on
exactly what needs to be done, in particular wrt transfers of data to
countries where the EU hasn't made an equivalence determination, but we
are slowly firming that up.
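
One conceivable use of such a list of deleted users on the receiving side, as a minimal Python sketch. The format of the list is not specified anywhere in this thread, so the set of numeric user IDs, the record layout, and the function name `redact_deleted_users` are all assumptions made for illustration. What a downstream data controller is actually obliged to do remains its own determination.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical record layout: one dict per stored object version.
Record = Dict[str, Optional[object]]

def redact_deleted_users(records: List[Record],
                         deleted_uids: Iterable[int]) -> int:
    """Blank the user fields of every record whose uid is on the list.

    Returns the number of records that were changed.
    """
    deleted = set(deleted_uids)
    changed = 0
    for rec in records:
        if rec.get("uid") in deleted:
            rec["uid"] = None
            rec["user"] = None
            changed += 1
    return changed

# Made-up sample data, not real OSM users.
records = [
    {"id": 1, "uid": 42, "user": "example_a"},
    {"id": 2, "uid": 7, "user": "example_b"},
]
changed = redact_deleted_users(records, [42])
```

Only the user fields are touched; the object itself stays in the local copy.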

Simon
> Please stop this nonsense now!
>
> Jochen






Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Frederik Ramm
Hi,

On 20.06.2018 08:32, Christoph Hormann wrote:
> Such agreement would not be an agreement to process your own data given 
> by individuals to the OSMF (which is the kind of agreement you would 
> normally expect in the GDPR context).  You probably mean some kind of 
> contractual agreement about what can be done with the data.

Yes. This also requires the delicate distinction that not everything in
a .osm file is necessarily under ODbL.

> But to be 
> honest i don't really see the point in that.  People who download the 
> data can easily create an ad hoc account every time they download data. 

Yes. There would still be a natural person in front of the monitor who
clicks "I agree to be bound by these rules" though.

> The OSMF does not verify the identity of who is behind a user account 
> created. 

And doesn't intend to.

> So what do you expect to gain from such an agreement?  Is 
> there any reason to assume that in a case of such data being released 
> in a way that is not according to the legal requirements by a third 
> party the agreement can be used to avoid legal responsibility for the 
> OSMF it would otherwise need to face?

I think the idea is more: If someone releases, or abuses, personal OSM
data, it is clear that

* this violates OSMF policy and
* someone somewhere in the transport chain from OSM server to
rule-violating use has agreed to rules that they then broke.

In my view, this is not "cargo cult". If someone comes to us, today, and
complains that their OSM contributions are being used to stalk them,
then we cannot even point to a rule that says you cannot do this. The
stalker is, as far as OSMF is concerned, 100% within their rightful use
of the data. This is something that needs to stop - even if, in the
future, it only becomes marginally more difficult for the stalker to use
OSM data, at least we clearly say that (a) this use is not allowed, and
(b) the stalker knows it.

> What i can understand is giving people a simple selection option between 
> 
> [ ] i want to be safe w.r.t. personal data and not being provided 
> potentially sensitive information when logged in.
> [ ] i want to have the possibility to access potentially sensitive data 
> when logged in.
> 
> which would mainly be a service to the user - kind of like the sensitive 
> content switch on youtube.

This is essentially the login. If you are not logged in to OSM then you
will not have access to personal data. If you are logged in, then you
will. We are not currently planning to offer a third way (logged in with
the capability to edit but unable to see personal data).

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"



Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Frederik Ramm
Hi,

On 20.06.2018 07:58, Jochen Topf wrote:
>> 3a. issue guidelines about what you are allowed to do with the user data
>> files,
>> 3b. ensure that everyone who has an OSM account agrees to these
>> guidelines one way or the other,

> This is the part that's not easy and where there is a lot of important
> detail missing. You have to get everybody to agree, which is not going
> to happen.

I was thinking of simply blocking accounts from logging in until they
have agreed. Or more precisely, they would be able to log in, but only
to see the message telling them they need to "click here".

> You probably have to
> change rules in the future so you have to make this generic, keeping
> information about who clicked through which version of the rules. 

Unsure how useful that would be; would I not want to have everyone "on
the same page" at all times, i.e. having agreed to the same rules?

> So you
> are generating more information you are tracking with each user, more
> personal information for which you need consent from the user. 

As I said, I would simply block all accounts until they have agreed to
the rule. This is not just about being allowed to download data; someone
who edits OSM will also have access to the full user data through the
API and hence agreeing to the rules is a prerequisite for editing too.
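
A minimal sketch of this gate in Python, assuming a single stored "accepted rules version" per account. The `Account` class, the field name and `CURRENT_TERMS_VERSION` are hypothetical and do not describe the actual OSM website code.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical: bumped whenever the rules change.
CURRENT_TERMS_VERSION = 2

@dataclass
class Account:
    name: str
    # None means the account has never agreed to any version.
    accepted_terms_version: Optional[int] = None

def may_proceed(account: Account) -> bool:
    """True only if the account has agreed to the current rules.

    Anyone else would be shown the "click here" message instead.
    """
    return account.accepted_terms_version == CURRENT_TERMS_VERSION

def accept_terms(account: Account) -> None:
    """Record that the account agreed to the current rules."""
    account.accepted_terms_version = CURRENT_TERMS_VERSION

acct = Account("example")
blocked_before = not may_proceed(acct)
accept_terms(acct)
allowed_after = may_proceed(acct)
```

Because only the current version unlocks the account, a rule change puts everyone back behind the message until they agree again, which keeps all users on the same rules without keeping a history of who accepted what.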

> All of
> this needs to be tied in the OAuth stuff and it has to be done in a way
> that 3rd party services using OSM data can ask *their* downstream users
> to identify in the same way which allows OSM to track everybody who uses
> the full OSM data everywhere adding more personal data to keep and to
> explain to users and get permissions from users for.

No, there's a mistake in your reasoning here.

It is true that downstream data distributors like Overpass or the
Geofabrik downloads need to be able to verify whether someone has an OSM
account or not. Pascal has been doing that for ages on his HDYC site,
for example.

But downstream data distributors do not need to know or store anything
more than that; the Geofabrik download server for example will not even
store the user name of the person who has logged in, just that "whoever
is here has just proven they have an OSM account". So the downstream
distributor can deal with this without processing any personal data. (It
would be possible to extend our OAuth system by a call that would not
even return the user's identity to the caller - currently the identity
is returned to the caller and the caller must then decide whether to
process it or not.)
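
One way a downstream site could remember "has proven they have an OSM account" without keeping any identity is a signed token that carries nothing but an expiry time. The Python sketch below illustrates that idea under stated assumptions: the secret, the token layout and the function names are hypothetical, and this is not claimed to be how the Geofabrik server does it.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical server-side key; would be random and kept private.
SECRET = b"replace-with-a-random-server-side-key"

def _sign(payload: str) -> str:
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()

def issue_token(ttl_seconds: int = 3600, now: Optional[float] = None) -> str:
    """Create a token that encodes only an expiry time, no user name or ID.

    Called once the OAuth check against OSM has succeeded.
    """
    current = time.time() if now is None else now
    expires = str(int(current + ttl_seconds))
    return f"{expires}.{_sign(expires)}"

def token_is_valid(token: str, now: Optional[float] = None) -> bool:
    """Check signature and expiry; nothing about the user is looked up."""
    try:
        expires_str, sig = token.split(".", 1)
        expires = int(expires_str)
    except ValueError:
        return False
    current = time.time() if now is None else now
    # Compare as bytes in constant time to avoid leaking the signature.
    sig_ok = hmac.compare_digest(sig.encode(), _sign(expires_str).encode())
    return sig_ok and expires > current

token = issue_token(ttl_seconds=60, now=1000)
```

The token can be handed to the browser as a cookie; the server needs no session table and never writes down who the visitor was.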

On the OSM server side it is true that the server can know that "user X
has just gone through the OAuth process at ". But
there's no reason why we would have to keep, store, or process this
information in any way. If we don't process the data then we don't have
to explain, and we don't have to get permission either.

(I don't see why it would be useful to store who has downloaded a full
planet file when.)

> Please stop this nonsense now!

The only alternatives I can see would be:

1. Stop distributing who-did-what-when information to everyone, period.
This is possible but it would create a privileged class inside OSM that
has access to this information, and it would harm the ability of the
community to do QA.

2. Take the view that distributing the data is what we do and tough
luck, you've signed up to it. The LWG has advised against that course of
action. Even if we were to get away with it, we would still have to stop
distributing someone's data once they protest (or at least restrict the
distribution of data), at which point you either have to implement the
whole stuff I outlined in my original post and ensure that some user
data is not publicly available, or (point 1 above) stop distributing
that one person's data altogether.

Given these alternatives, I think the course currently followed by the
OSMF is the least disruptive.

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"



Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Christoph Hormann
On Tuesday 19 June 2018, Frederik Ramm wrote:
> [...]
> 3b. ensure that everyone who has an OSM account agrees to these
> guidelines one way or the other,

This is the point that looks very fuzzy to me.  Could someone point out 
the legal concept behind this idea for me?

Such agreement would not be an agreement to process your own data given 
by individuals to the OSMF (which is the kind of agreement you would 
normally expect in the GDPR context).  You probably mean some kind of 
contractual agreement about what can be done with the data.  But to be
honest I don't really see the point in that.  People who download the
data can easily create an ad hoc account every time they download data.  
The OSMF does not verify the identity of who is behind a user account 
created.  So what do you expect to gain from such an agreement?  Is 
there any reason to assume that in a case of such data being released 
in a way that is not according to the legal requirements by a third 
party the agreement can be used to avoid legal responsibility for the 
OSMF it would otherwise need to face?  To me this looks more like cargo
cult actionism: doing something that looks like a serious measure on
the surface but turns out to be a hollow promise on closer inspection.

Note these concerns are not about the idea of restricting access to 
sensitive data to logged in users, it is about requiring some kind of 
agreement from these users.

What I can understand is giving people a simple selection option between

[ ] I want to be safe w.r.t. personal data and not be provided with
potentially sensitive information when logged in.
[ ] I want to have the possibility to access potentially sensitive data
when logged in.

which would mainly be a service to the user - kind of like the sensitive 
content switch on youtube.

-- 
Christoph Hormann
http://www.imagico.de/



Re: [OSM-dev] GDPR implementation on planet.osm.org

2018-06-20 Thread Jochen Topf
[ a lot of stuff that is (technically) reasonably easy deleted ]

On Tue, Jun 19, 2018 at 10:54:07PM +0200, Frederik Ramm wrote:
> 3a. issue guidelines about what you are allowed to do with the user data
> files,
> 3b. ensure that everyone who has an OSM account agrees to these
> guidelines one way or the other,

This is the part that's not easy and where there is a lot of important
detail missing. You have to get everybody to agree, which is not going
to happen. So you have to add some flag to the database telling the
system whether you are allowed to download or not. You probably have to
change rules in the future so you have to make this generic, keeping
information about who clicked through which version of the rules. So you
are generating more information you are tracking with each user, more
personal information for which you need consent from the user. All of
this needs to be tied in the OAuth stuff and it has to be done in a way
that 3rd party services using OSM data can ask *their* downstream users
to identify in the same way which allows OSM to track everybody who uses
the full OSM data everywhere adding more personal data to keep and to
explain to users and get permissions from users for.

Please stop this nonsense now!

Jochen
-- 
Jochen Topf  joc...@remote.org  https://www.jochentopf.com/  +49-351-31778688



[OSM-dev] GDPR implementation on planet.osm.org

2018-06-19 Thread Frederik Ramm
Hi,

as you probably know, the EU data protection rules compel us to be a bit
less open in handing out personal data to everyone. Following LWG's
analyses and recommendations, the OSMF has decided to implement
restrictions on publishing user names and changeset IDs.

The general plan is to allow everyone "in OSM" (i.e. with an OSM
account) to fully access all data as before (and have a policy that says
you must only use the personal data for OSM purposes), while removing
user names, user IDs, and changeset IDs from the publicly available data
(i.e. what you can get without an OSM account).

This requires changes to the API which I've started to sketch here:
https://wiki.openstreetmap.org/wiki/GDPR/Affected_Services

but this message is about changes to the downloads on
planet.openstreetmap.org. Here's a three phase plan for changing the way
we run planet.openstreetmap.org, and I would like to hear feedback about
the feasibility from users and those familiar with running the site
alike. I haven't run this by the sysadmins so if there are any bloopers
I hope they will be pointed out. (I will put this up on
https://wiki.openstreetmap.org/wiki/GDPR/Planet.osm_Migration and try to
work in any results from discussion here but if you're more comfortable
editing directly on the Wiki that's fine too.)

Cheers
Frederik


Phase 1 - Introduction of no-userdata files
---

This does not require software development and could start immediately,
but some scripting is required.

1a. set up a new domain for OSM internal data downloads, e.g.
"osm-internal.planet.openstreetmap.org", initially duplicating all data.

Issue: name of domain?
Issue: ironbelly disk usage is at 70%, possible to add space?

1b. modify the planetdump.erb in the planet chef cookbook to generate
versions without user information of all the weekly dumps, in addition
to the versions with user information; have the versions without user
information stored in the old "planet.openstreetmap.org" tree, and the
versions with user information in the new "osm-internal" tree.

Issue: should files have the same names on internal and public site, or
should they be called "planet-with-userdata" and "planet" or something?

1c. modify the replication.cron.erb as follows:

* have osmosis write minutely replication files to the new "internal" tree
* run a shell script after generating the replication files that will
find the newly generated file, pipe it through osmium stripping user
information, and write the result to the old "planet" tree, copying the
state.txt files as needed
* run the osmosis "merge-diff" tasks separately on both trees OR run on
internal tree only and pipe result through osmium as above
* write changeset replication XMLs to the new "internal" tree only

For step 1c, it might make sense to announce a maintenance window
beforehand during which the changes will be made, so that consumers who
rely on user data can stop their replication for a few hours and then
make the switch.
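
To illustrate what "stripping user information" in step 1c means at the attribute level, here is a small Python stand-in. The plan itself uses osmium for this; the code below is not that tool, and the function name and sample data are made up. It assumes the attributes to drop are `user`, `uid` and `changeset`, matching the fields named in the general plan.

```python
import xml.etree.ElementTree as ET

# Attributes assumed to carry personal data (user names, user IDs,
# changeset IDs); everything else, including timestamps, is kept.
PERSONAL_ATTRS = ("user", "uid", "changeset")

def strip_user_data(osc_xml: str) -> str:
    """Return the OsmChange document with personal attributes removed."""
    root = ET.fromstring(osc_xml)
    for elem in root.iter():
        if elem.tag in ("node", "way", "relation"):
            for attr in PERSONAL_ATTRS:
                elem.attrib.pop(attr, None)
    return ET.tostring(root, encoding="unicode")

# Made-up sample diff with a single modified node.
SAMPLE = """<osmChange version="0.6">
  <modify>
    <node id="1" version="2" timestamp="2018-06-19T20:54:07Z"
          uid="42" user="example" changeset="123" lat="49.0" lon="8.4"/>
  </modify>
</osmChange>"""

cleaned = strip_user_data(SAMPLE)
```

Object IDs, versions, timestamps and coordinates pass through unchanged, so consumers on the public tree can still apply the diff.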

1d. modify planet.openstreetmap.org index pages to point to the internal
page in case people wish to download stuff with user data; place marker
on internal page that these files are with user data.

At the end of phase 1, we will have this situation:

* new changeset diffs only on the "internal" tree
* regular diffs come in two flavours, with and without user data
* planet dumps etc. also come in two flavours
* old files are unchanged
* consumers will automatically get the stuff without user data
* consumers who need user data will have to change their URLs

Phase 2 - Cleaning out old files that contain user data
---

This can be done slowly in the background over the course of however
long it takes:

2a. remove all changeset dumps and changeset diffs from the public tree.
2b. run all .osc, .osm.pbf, and .osm.bz2 files on the public tree
through osmium, scrubbing user data (retain file timestamp if possible)
and re-creating .md5 files where necessary
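
The bookkeeping in 2b (keep the file timestamp, re-create the .md5 file) could look roughly like the Python sketch below. The scrubbing itself is left out and would be done by osmium as described; the function name is hypothetical, and the `.md5` layout of "digest, two spaces, file name" is an assumption based on the usual md5sum convention rather than something confirmed for planet.openstreetmap.org.

```python
import hashlib
import os
import tempfile

def replace_preserving_mtime(path: str, new_bytes: bytes) -> str:
    """Overwrite path with new_bytes, keep its old timestamps, and
    write a sibling .md5 file. Returns the hex digest."""
    st = os.stat(path)
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(new_bytes)
    os.replace(tmp, path)  # swap in the scrubbed file in one step
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns))
    digest = hashlib.md5(new_bytes).hexdigest()
    with open(path + ".md5", "w") as f:
        f.write(f"{digest}  {os.path.basename(path)}\n")
    return digest

# Demonstration on a throwaway file with a made-up old timestamp.
with tempfile.TemporaryDirectory() as workdir:
    demo_path = os.path.join(workdir, "example.osc")
    with open(demo_path, "wb") as f:
        f.write(b"<osmChange/> with user data")
    os.utime(demo_path, (1529441647, 1529441647))
    demo_digest = replace_preserving_mtime(demo_path, b"<osmChange/>")
    demo_mtime = int(os.stat(demo_path).st_mtime)
    with open(demo_path + ".md5") as f:
        demo_md5_line = f.read()
```

Writing to a temporary name first means a reader never sees a half-written file on the public tree.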

Phase 3 - Controlling access to files with user data
---

Once the parallel systems are up and running, we will want to

3a. issue guidelines about what you are allowed to do with the user data
files,
3b. ensure that everyone who has an OSM account agrees to these
guidelines one way or the other,
3c. start requiring an OSM login for all downloads from the internal,
"with userdata" tree.

One possible technical solution for 3c is
https://github.com/geofabrik/sendfile_osm_oauth_protector which also
comes with a guide for users on how to run it in a scripted setup.


-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"
