On 11/13/2013 11:58 PM, Luca Milanesio wrote:
We need to run some tests on the scalability of the events API, because:
1) we need to monitor over 1000 repos (one call per repo? one call for all?)
2) even monitoring the entire jenkinsci org, 300 events could be not enough in
case of a catastrophic event
The good news is that a push that removes/alters refs also takes time.
I have the notification e-mails from your push to 186 repos, and they span
over an hour.
So I'm hoping that polling 300 events every minute would cover us pretty
well. And like you say, a webhook can help us reduce this window down
even further.
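As a quick sanity check on those numbers (the 300-event buffer and one-minute poll interval discussed above), the sustained push rate needed to overflow the buffer between two polls works out like this:

```python
# Back-of-the-envelope check on the polling window discussed above.
BUFFER_SIZE = 300      # events visible in one poll of the org feed
POLL_INTERVAL_S = 60   # seconds between polls

# Sustained event rate needed to push activity out of the buffer
# before we see it, i.e. to open a blind spot.
overflow_rate = BUFFER_SIZE / POLL_INTERVAL_S
print(overflow_rate)  # 5.0 events/second, sustained
```

Five pushes per second, sustained across the whole org, is a high bar, which is why a one-minute window seems tolerable.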
There's another reason I'm optimistic about this scheme.
Suppose you are maliciously trying to cause data loss. If we are
regularly recording refs, you have to mount an attack immediately after
some commits go in so as to overwhelm the 300 event buffer, then keep
that saturation going so that your ref updates/removals will also be
dropped from the event buffer. And even with this much effort you can
only cause the data loss of the commits that went in right before yours.
So I think this makes the attack ineffective enough that we can tolerate
the risk, and I find it unlikely that any accident would look like this
pattern.
Working at the webhook level? I'll investigate further the reliability /
scalability of the API (on a series of *test* repos *OUTSIDE* the Jenkins CI
organisation).
Luca.
On 13 Nov 2013, at 18:56, Kohsuke Kawaguchi <[email protected]> wrote:
On 11/11/2013 11:05 PM, Luca Milanesio wrote:
Seems a very good idea, it is basically a remote audit trail.
The only concern is the throttling on the GitHub API: it would be better,
then, to do the scripting on a local mirror of the GitHub repos. When you
receive a forced update you still have all the previous commits and
the full reflog anyway.
With respect to throttling, the events API is designed for polling [1], so we
just need to poll the events for the entire jenkinsci org [2] and we'll have
the whole history.
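A minimal polling sketch, assuming Python and hypothetical names: the events API honors conditional requests, so sending back the previous ETag lets GitHub answer 304 Not Modified instead of resending the body (per the docs at [1], such responses don't count against the rate limit):

```python
import json
import urllib.error
import urllib.request

EVENTS_URL = "https://api.github.com/orgs/jenkinsci/events"  # [2] above

def poll_events(url, etag=None, opener=urllib.request.urlopen):
    """One conditional poll of an events feed.

    Returns (events, new_etag). Passing back the ETag from the previous
    poll lets the server answer 304 Not Modified when nothing changed.
    `opener` is injectable so the logic can be tested off the network.
    """
    req = urllib.request.Request(url)
    if etag:
        req.add_header("If-None-Match", etag)
    try:
        resp = opener(req)
        events = json.loads(resp.read().decode("utf-8"))
        return events, resp.headers.get("ETag")
    except urllib.error.HTTPError as err:
        if err.code == 304:   # nothing new since the last poll
            return [], etag
        raise
```

A driver would call `poll_events(EVENTS_URL, etag)` once a minute, carrying the returned ETag forward between iterations.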
We already do an equivalent of local mirrors of the GitHub repos in
http://git.jenkins-ci.org/. The problem is that reflogs do not record remote
ref updates, so it will not protect against accidental ref manipulations.
It does help, however, for the purpose of retaining commit objects, so we
need to keep this.
However, as you said, by being triggered via a webhook the number of API
calls can be reduced to a minimum.
I would submit a proposal to the Git mailing list for a "fetch by SHA1"
feature, which is missing in Git IMHO.
My recollection is that this was intentional, for security reasons: if a
push is made accidentally and then removed, those objects shouldn't remain
accessible.
I think what's useful and safe is to allow us to create a ref remotely on an
object that doesn't exist locally. Again, the transport-level protocol allows
this, so it'd be nice to expose it.
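For what it's worth, the v3 API already seems to expose something close on the hosting side: POST /repos/:owner/:repo/git/refs creates a ref on the server pointing at a SHA1 the server already holds, with no local copy of the object needed. A sketch of building that request (the helper name is mine, not GitHub's):

```python
import json

GITHUB_API = "https://api.github.com"

def create_ref_request(owner, repo, ref, sha):
    """Build the URL and JSON body for the v3 "create a reference"
    call, which points a brand-new ref at a SHA1 the server already
    holds; no local copy of the object is required."""
    url = "%s/repos/%s/%s/git/refs" % (GITHUB_API, owner, repo)
    body = json.dumps({"ref": ref, "sha": sha})
    return url, body
```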
Thanks to everyone including GitHub for the help and cooperation in
getting this sorted out !!
[1] http://developer.github.com/v3/activity/events/
[2] https://api.github.com/orgs/jenkinsci/events
Luca
---------
Sent from my iPhone
Luca Milanesio
Skype: lucamilanesio
On 12 Nov 2013, at 06:25, Kohsuke Kawaguchi <[email protected]
<mailto:[email protected]>> wrote:
Now that the commits have been recovered and things are almost back to
normal, I think it's time to think about how to prevent this kind of
incident in the future.
Our open commit access policy was partly made possible by the idea
that any bad commits can always be rolled back. But what I failed to
think through was that changes to refs aren't themselves
version-controlled, so it is possible to lose commits through incorrect
ref manipulation, such as "git push -f", or by deleting a branch.
I still feel strongly that we should maintain the open commit access policy.
This is how we've been operating for the longest time, and also
because otherwise adding/removing developers to repositories would be
prohibitively tedious.
So my proposal is to write a little program that uses the GitHub events
API to keep track of push activities in our repositories. For every
update to a ref in a repository, we can record the timestamp, the SHA1
before and after, and the user ID. We can maintain a text file for every
ref in every repository, and the program can append lines to it. In
other words, effectively recreate a server-side reflog outside GitHub.
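A sketch of that append-only recording, assuming Python; the file layout and field order here are my assumptions, not anything agreed on:

```python
import os
import time

def record_ref_update(root, repo, ref, before, after, pusher):
    """Append one reflog-style line for a single ref update.

    Layout (assumed): one text file per ref under <root>/<repo>/...,
    each line "<epoch> <before-sha1> <after-sha1> <pusher>".
    """
    path = os.path.join(root, repo, ref.replace("refs/", "", 1))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    line = "%d %s %s %s\n" % (int(time.time()), before, after, pusher)
    with open(path, "a") as f:   # append-only, like a reflog
        f.write(line)
    return line
```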
The program should also fetch commits, so that it has a local copy of
every commit that ever landed on our repositories. Doing this also
allows the program to detect non-fast-forward pushes. It should warn us
in that situation, and it will also create a local ref on the commit to
prevent it from getting lost.
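Detecting a non-fast-forward boils down to checking whether the old SHA1 is an ancestor of the new one; a toy version over an in-memory parent map (a stand-in for walking real git history):

```python
def is_fast_forward(parents, old, new):
    """True iff `old` is reachable from `new` by following parent
    links, i.e. the ref update old -> new is a fast-forward.
    `parents` maps a commit SHA to its list of parent SHAs."""
    stack, seen = [new], set()
    while stack:
        sha = stack.pop()
        if sha == old:
            return True
        if sha in seen:
            continue
        seen.add(sha)
        stack.extend(parents.get(sha, []))
    return False
```

If this returns False for an observed update, the watcher would raise a warning and pin the old SHA1 with a local ref before it can be garbage-collected.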
We can then make these repositories accessible via rsync to encourage
people to mirror them for backup, or we can make them publicly
accessible by hosting them on GitHub as well, although the latter
could be confusing.
With a scheme like this, pushes can be safely recorded within a minute
or so (and this number can go down even further if we use webhooks).
If a data loss occurs before the program gets to record newly pushed
commits, we should still be able to record who pushed afterwards, to
identify who has the commits that were lost. With such a small time
window between the push and the record, the number of such lost
commits should be low enough that we can recover them manually.
--
Kohsuke Kawaguchi
--
You received this message because you are subscribed to the Google
Groups "Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to [email protected]
<mailto:[email protected]>.
For more options, visit https://groups.google.com/groups/opt_out.
--
Kohsuke Kawaguchi | CloudBees, Inc. | http://cloudbees.com/
Try Jenkins Enterprise, our professional version of Jenkins