Hi Boris,

I was one of the people asking off-list because I have a concern about using
encryption as a technique for anonymizing data. It immediately raises a red
flag for me because encryption is reversible by design: whoever holds the key
can de-anonymize the data. I would therefore like to see data masking
techniques such as hashing used instead of encryption. To be clearer, I find
it suspicious that reversible anonymization is being used in the first place.
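
To illustrate what I mean by hashing, here is a minimal sketch, not a
reviewed design. It uses a keyed hash (HMAC-SHA256) rather than a bare hash,
which is my own assumption: a bare hash of an e-mail address could be matched
by hashing a list of guessed addresses. The function name, the normalisation
step and the example addresses are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_pseudonymiser(key: bytes):
    """Return a function mapping an identifier to a stable, one-way token."""

    def pseudonymise(identifier: str) -> str:
        # Assumption: trim and lower-case so that trivially different
        # spellings of the same identifier map to the same token.
        normalised = identifier.strip().lower().encode("utf-8")
        # HMAC-SHA256 is a one-way function: there is no decryption step.
        return hmac.new(key, normalised, hashlib.sha256).hexdigest()

    return pseudonymise


# Hypothetical one-off secret key. Discarding it after the run means nobody
# can recompute tokens from guessed identifiers later on.
key = secrets.token_bytes(32)
pseudonymise = make_pseudonymiser(key)

a = pseudonymise("alice@example.org")
b = pseudonymise(" Alice@Example.org ")
c = pseudonymise("bob@example.org")

print(a == b)  # same person across sources gets the same token
print(a == c)  # different people get different tokens
```

Identical IDs still match across sources as long as one key is used for the
whole run, so the datasets should remain just as useful for research.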

Can you also be more specific about what public data and which API endpoints 
you are going to use?

I assume it's anything that is public in Git already, which would make this
discussion moot, as everything is already public. But I want to confirm that
none of the API endpoints is accessed with authentication to obtain data you
would not get without authentication.

Best,
Gunnar

-- 
Gunnar Wagenknecht
[email protected], http://guw.io/

> On Apr 26, 2018, at 07:18, Boris Baldassari <[email protected]> wrote:
> 
> Hello good people,
> 
> In the context of the Crossminer research project [1], we plan to publish a 
> number of datasets to the public and for the research community. This 
> includes public data from the Eclipse forge (i.e. data is fetched from public 
> data sources and APIs only), and we want to set up an anonymisation process 
> that would:
> 
> * Efficiently and safely remove all personally identifiable data -- we don't 
> want to help spammers or malicious harvesters, and
> * Still provide valuable information and datasets for the research community 
> -- e.g. ability to identify identical IDs across sources without specifically 
> knowing them.
> 
> The basic idea is to simply replace all identifiers with asymmetrically 
> encrypted strings, so identical IDs yield the same ciphertext. RSA is used for 
> the encryption, and the private key is thrown away once the encoding is done, 
> making it impossible (according to common encryption standards) to retrieve 
> the original string.
> 
> A prototype has already been published [2, 3] and we would like to ask people 
> to review it so as to make sure that our privacy-preserving mechanism is safe.
> 
> Any feedback, concern or contribution is warmly welcome.
> 
> [1] https://www.crossminer.org/
> [2] https://github.com/borisbaldassari/data-anonymiser
> [3] https://borisbaldassari.github.io/data-anonymiser/
> 
> Thanks in advance, have a wonderful week!
> 
> --
> boris
> _______________________________________________
> cross-project-issues-dev mailing list
> [email protected]
> To change your delivery options, retrieve your password, or unsubscribe from 
> this list, visit
> https://dev.eclipse.org/mailman/listinfo/cross-project-issues-dev

