----- "Toshio Kuratomi" <[EMAIL PROTECTED]> wrote:

<snip>
> > 
> Getting koji data munged and transferred may be a problem as it is just
> so darn big.  If we don't have to make changes to the data in koji, just
> get it distributed, then we could give access to a backup... but that's
> still a lot of information to transfer.

We would only need a portion of the data.  Ideally everything since the last 
supported version of each distribution (or one after so we get obsolete data to 
test against) but in reality the last month of activity should be suitable.

> pkgdb, fas, and bodhi are relatively small.
> 
> fas is where we'd have our major security problems.  We can't give the
> information out unmunged.  I've munged it before, though, so it's
> doable.  How strict we need to be is an issue, though.  If we remove all
> the identifying information in the people table except for the userid,
> is that sufficient?  *Note: We probably also need to munge data in the
> configs table.

As long as we randomly generate data for that (well, the username at least).  Note
that UIDs are easily mapped back to usernames, so you might want to randomize
those too.  Also, I believe packagedb and bodhi use usernames as the key instead
of UIDs, so those would have to match accounts in the munged FAS db.  I would
suggest generating a list of names from a dictionary and using that list to
randomize names in the other services.  Of course, the names need to correspond
to group permissions, so some logic would be needed to make sure records
associated with a given name are valid.  However, the ability to recreate
the associated usernames may not be an issue, since all of that data is public.
More importantly, we need to make sure we aren't giving out addresses, phone
numbers, password hashes and other such keys.
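
To make the dictionary idea concrete, here is a rough sketch of what I have in
mind.  It is only an illustration: the column names ("username", "id", "email",
"telephone", "postal_address", "password", "owner") and the helper functions
are made up for the example and would need to be checked against the real FAS,
pkgdb and bodhi schemas.

```python
import random

# Hypothetical column names; the real FAS people table may differ.
SENSITIVE_FIELDS = ("email", "telephone", "postal_address", "password")


def build_pseudonyms(usernames, words, seed=None):
    """Map each real username to a unique replacement drawn from `words`."""
    unique_names = sorted(set(usernames))
    # Drop any word that is also a real username so a pseudonym never
    # collides with an existing account name.
    pool = sorted(set(words) - set(unique_names))
    if len(pool) < len(unique_names):
        raise ValueError("word list is too small for the number of users")
    rng = random.Random(seed)
    return dict(zip(unique_names, rng.sample(pool, len(unique_names))))


def munge_person(row, pseudonyms, new_id):
    """Return a copy of a people row with identifying data replaced."""
    munged = dict(row)
    munged["username"] = pseudonyms[row["username"]]
    munged["id"] = new_id  # randomized UID chosen by the caller
    for field in SENSITIVE_FIELDS:
        if field in munged:
            munged[field] = None
    return munged


def munge_owner(row, pseudonyms, key="owner"):
    """Rewrite the username key in a pkgdb- or bodhi-style row."""
    munged = dict(row)
    munged[key] = pseudonyms[row[key]]
    return munged
```

Because the same mapping is applied to every service, a record in pkgdb or
bodhi still points at the matching account in the munged FAS data, and group
memberships stay valid as long as they are rewritten with that mapping too.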

> pkgdb and bodhi don't have information that is privacy policy sensitive.
> (Which doesn't mean that some users won't like it... just that I think
> we're covered.)

Mike's suggestion of running it by legal sounds like the best route. 
 
--
John (J5) Palmieri
Software Engineer
Red Hat, Inc.

_______________________________________________
Fedora-infrastructure-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/fedora-infrastructure-list
