Konstantin Ryabitsev <[email protected]> wrote:
> Good day:
>
> We've had a few requests to mirror public-inbox archives that originate on
> other systems so they can also be searchable and viewable via lore.kernel.org.
> I've been dragging my feet on these requests, because they are a potential
> liability in terms of GDPR compliance.
I just tried using `git replace' for the first time:
	git replace --edit $BLOB_OID
And all the `git cat-file --batch' invocations appear to work as
if the original blob contents never existed. Of course,
reindexing could be necessary, as would changing the git config
to ensure `git fetch' doesn't destroy elements in the
refs/replace/ namespace.
git clone/fetch still include both the original and replacement
blobs, though (favoring the replacement), so perhaps `git replace'
isn't a fit...
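For reference, a minimal non-interactive sketch of the experiment
above, run in a scratch repo. The blob contents and the `demo'
directory name are made up for illustration, and it uses the
two-argument form of `git replace' instead of --edit so no editor
is needed:

```shell
# Scratch repo; "demo" and the blob contents are hypothetical.
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo

# Write two loose blobs: the one to hide and its stand-in.
orig=$(echo 'sensitive data' | git hash-object -w --stdin)
repl=$(echo 'redacted' | git hash-object -w --stdin)

# Creates refs/replace/$orig pointing at $repl.
git replace "$orig" "$repl"

# Reads of the original OID now return the replacement...
git cat-file -p "$orig"

# ...but the original is still in the object store.
git --no-replace-objects cat-file -p "$orig"

# The replacement is only a ref, so anything that prunes or
# overwrites refs/replace/ undoes it.
git for-each-ref refs/replace/
```

The second `cat-file' is the catch: the original contents are
only hidden, not gone, which is why anyone fetching the repo can
still get at them.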
Then, the worst case would be to temporarily remove the mirror,
or fork it (via -edit/-purge + subscribe) until upstream cleans
it up.
lei's v2 inbox output could be used as a subscription mechanism.
> If we are merely mirroring the archive from some other location, then there
> should be a clear indication of the origin of the data and contact information
> of the maintainer of the remote archive where someone could send requests for
> any data removal. It's best if this is visible both via the web view and in
> raw messages retrieved via our service, e.g. via an "X-Archive-Origin:" header
> or something similar.
I sometimes use the $INBOX_DIR/description file for that and it
affects WWW and NNTP, but not IMAP/POP3. I'm not sure if I want
to reintroduce header injection in case there's some conflict
with DKIM or other signature mechanisms[1].
> Any thoughts on this issue?
IANAL, obviously...
[1]
https://public-inbox.org/meta/[email protected]/