Ævar Arnfjörð Bjarmason <[email protected]> wrote:
> On Wed, Jun 21 2017, Tim Hutt jotted:
>
> > Hi,
> >
> > Currently if you want to monitor a repository for changes there are
> > three options:
> >
> > * Polling - run a script to check for updates every 60 seconds.
> > * Server side hooks
> > * Web hooks (on Github, Bitbucket etc.)
> >
> > Unfortunately for many (most?) cases server-side hooks and web hooks
> > are not suitable. They require you to both have admin access to the
> > repo and have a public server available to push updates to. That is a
> > huge faff when all I want to do is run some local code when a repo is
> > updated (e.g. play a sound).

Yeah, it kinda sucks that way.

Currently, for one of my public-inbox mirrors which has ssh
access to the primary server on public-inbox.org, I have:

	#!/bin/sh
	while true
	do
		# GNU tail(1) uses inotify to avoid polling on Linux
		ssh public-inbox.org tail -F /path/to/git-vger.git/info/refs | \
		while read sha1 ref
		do
			for GIT_DIR in git-vger.git
			do
				export GIT_DIR
				git fetch || continue
				git update-server-info
				public-inbox-index # update Xapian index
			done
		done
	done

It's not perfect as it requires multiple processes on the
server, but it's better than polling for my limited use.

> > Currently people resort to polling
> > (https://stackoverflow.com/a/5199111/265521) which is just ugly. I
> > would like to propose that there should be a fourth option that uses a
> > persistent connection to monitor the repo. It would be used something
> > like this:
> >
> > git watch https://github.com/git/git.git
> >
> > or
> >
> > git watch [email protected]:git/git.git
> >
> > It would then print simple messages to stdout. The complexity of what
> > it prints is up for debate, - it could be something as simple as
> > "PUSH\n", or it could include more information, e.g. JSON-encoded
> > information about the commits. I'd be happy with just "PUSH\n" though.
>
> Insofar as this could be implemented in some standard way in Git it's
> likely to have a large overlap with the "protocol v2" that keeps coming
> up here on-list. You might want to search for past threads discussing
> that.

Yeah, it hasn't been a priority for me, either...

> > In terms of implementation, the HTTP transport could use Server-Sent
> > Events, and the SSH transport can pretty much do whatever so that
> > should be easy.
>
> In case you didn't know, any of the non-trivially sized git hosting
> providers (e.g. github, gitlab) provide you access over ssh, but you
> can't just run any arbitrary command, it's a tiny set of whitelisted
> commands. See the "git-shell" manual page (github doesn't use that exact
> software, but something similar).
>
> But overall, it would be nice to have some rationale for this approach
> other than that you think polling is ugly. There's a lot of advantages
> to polling for something you don't need near-instantly, e.g. imagine how
> many active connections a site like GitHub would need to handle if
> something like this became widely used, that's in a lot of ways harder
> to scale and load balance than just having clients that poll something
> that's trivially cached as static content.

Polling becomes more expensive with TLS and high-latency
connections, and also increases power consumption if done
frequently for redundancy purposes.
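
For comparison, about the cheapest polling available is to compare
`git ls-remote` output against a saved copy and fetch only when it
differs; that still costs one connection (and one TLS or ssh
handshake) per check.  A rough sketch, not something from the setup
above: the function names and the state file are made up for
illustration, and it assumes a remote named "origin" is already
configured in the mirror.

```shell
#!/bin/sh
# snapshot_changed STATE_FILE
# Read a ref listing on stdin and compare it with the copy saved in
# STATE_FILE.  Returns 0 (and saves the new listing) when it differs
# or no copy exists yet, 1 when nothing changed.
snapshot_changed() {
	state=$1
	new=$(cat)
	if test -f "$state" && test "$new" = "$(cat "$state")"
	then
		return 1
	fi
	printf '%s\n' "$new" >"$state"
}

# poll_once REMOTE STATE_FILE
# One polling step; run it inside the mirror (or with GIT_DIR set).
# REMOTE is assumed to be an already-configured remote such as "origin".
poll_once() {
	refs=$(git ls-remote "$1") || return 1	# network error, retry later
	if printf '%s\n' "$refs" | snapshot_changed "$2"
	then
		git fetch "$1"
	fi
}

# Usage (hypothetical state file path), one connection every minute:
#	while sleep 60; do poll_once origin "$HOME/.vger-refs"; done
```

Every iteration pays the connection setup even when nothing has
changed, which is the cost a persistent connection would avoid.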

I've long wanted to do something better to allow others to keep
public-inbox mirrors up-to-date.  Keeping userspace overhead down
to 64-128 bytes per connection should be totally doable based on
my experience working on cmogstored; at that point, port
exhaustion (or TLS overhead for HTTPS) becomes the limiting
factor.

But perhaps a cheaper option might be the traditional email/IRC
notification and having a client-side process watch for that
before fetching.
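
The client side of that could stay small: read the notification
stream line by line and fetch whenever a line mentions the
repository.  A rough sketch with made-up names (`on_notify` and the
`notify.log` file in the usage comment are hypothetical, and how
mail or IRC notifications end up in that file is left open):

```shell
#!/bin/sh
# on_notify PATTERN CMD [ARG...]
# Run CMD once for every stdin line that contains the literal string
# PATTERN.  CMD gets /dev/null as stdin so it cannot swallow pending
# notification lines.
on_notify() {
	pattern=$1
	shift
	while IFS= read -r line
	do
		case $line in
		*"$pattern"*)
			"$@" </dev/null || echo "command failed: $*" >&2
			;;
		esac
	done
}

# Usage (hypothetical log file fed by a mail or IRC client):
#	tail -F notify.log |
#		on_notify git-vger.git git --git-dir=git-vger.git fetch
```

This runs one fetch per matching line, so a burst of notifications
means a burst of fetches; coalescing them is left out to keep the
sketch short.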