Re: Git ~unusable on slow lines :,'C

Shawn Pearce Tue, 09 Oct 2012 08:58:34 -0700

On Tue, Oct 9, 2012 at 7:06 AM, Marcel Partap <[email protected]> wrote:
>>> Bam, the server kicked me off after taking to long to sync my copy.
>> This is unrelated to git. The HTTP server's configuration is too
>> impatient.
> Yes. How does that mean it is unrelated to git?


It means its out of our control, we cannot modify the HTTP server's
configuration to have a longer timeout. We can recommend that the
timeout be increased, but as you point out the admins may not do that.

>>> - git fetch should show the total amount of data it is about to
>>> transfer!
>> It can't, because it doesn't know.
> The server side doesn't know at how much the objects *it just repacked
> for transfer* weigh in?

Actually it does. Its just not used here. What value is that to you?
You asked for the repository. If you know its size is going to be ~105
MiB you have two choices... continue to get the repository you asked
for, or disconnect and give up. Either way the size doesn't help you.
It would require a protocol modification to send a size estimate down
to the client before the data in order to give the client a better
progress meter than the object count (allowing it instead to track by
bytes received). But this has been seen as not very useful or
worthwhile since it doesn't really help anyone do anything better. So
why change the protocol?

>> You asked for the current state of the repository, and that's what its
>> giving you.
> And instead, I would rather like to ask for the next 500 commits. No way
> to do it.

No, there isn't. Git assumes that once it has commit X, all versions
that predate X are already on the local workstation. This is a
fundamental assumption that the entire protocol relies on. It is not
trivial to change. We have been through this many times on the mailing
list, please search the archives for "resumable clone".

>> The timeout has nothing to do with git, if you can't
>> convince the admins to increase it, you can try using another transport
>> which doesn't suffer from HTTP, as it's most likely an anti-DoS measure.
> See, I probably can't convince the admins to drop their anti-dos measures.
> And they (drupal.org admins) probably will not change their allowed
> protocol policies.

Then if they are hosting really big repositories that are hard for
their contributors to obtain, they should take the time to write a
script that periodically creates a bundle file for each repository
using `git bundle create repo.bundle --all`. They can host these
bundle files in any file transport service like HTTP or BitTorrent,
and users can download and resume these using normal HTTP download
tools. Once you have a bundle file locally, you can clone from it with
modern Git with `git clone $(pwd)/repo.bundle` to initialize the
repository.

This is currently the best way to support resumable clone. The repo
will be stale by whatever time has elapsed since the bundle file was
created. But then Git can do an incremental fetch to catch up, and
this transfer size should be limited to the progress made since the
bundle was made. If bundles are made once per month or after each
major release its usually a manageable delta.

> It is not only a configuration issue for one particular server. Git in
> general is hardly usable on slow lines because
> - it doesn't show the volume of data that is to be downloaded!

If it did show you, what would you do? Declare defeat before it even
starts to download and give up and start a thread about how Git
requires too much bandwidth?

Have you tried to shallow clone the repository in question?

> - it doesn't allow the user to sync up in steps the circumstances will
> allow to succeed.

Sadly, this is quite true. :-(
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git ~unusable on slow lines :,'C

Reply via email to