Jelmer Vernooij wrote:
> Hi Michael,

Thanks for the reply!

> On Fri, 2010-02-05 at 15:58 +1300, Michael Hudson wrote:
>> We want to make code imports, or at least the ones done with a foreign
>> branch plugin, import incrementally.  This will work around some
>> resource leaks somewhere in the import plugin or bzr and allow us to
>> import really large repos like linux or firefox, but also will make
>> scheduling fairer and reduce the damage done by a network blip.
>>
>> This requires some infrastructure work to support an import status of
>> "partially successful" and so on, but I know how to do that.  The part
>> I'm a bit less sure of is how to do the "only import $N revisions" bit.
>>
>> One way would be to not try too hard, and import only $N _mainline_
>> revisions each time.  I think code like this could do that:
>>
>>     local_branch = ...
>>     foreign_branch = ...
>>     local_revno = local_branch.revno()
>>     foreign_revno = foreign_branch.revno()
>>     target_revno = min(local_revno + $N, foreign_revno)
>>     target_revid = foreign_branch.get_revid(target_revno)
>>     local_branch.pull(foreign_branch, stop_revision=target_revid)
>>     if target_revno == foreign_revno:
>>         return SUCCESS
>>     else:
>>         return PARTIAL_SUCCESS
>
>> What I don't know is if this will be very efficient at all; does
>> get_revid() on a mercurial or svn or git branch perform acceptably?
> bzr-svn branches have this call and it's quite cheap, but it can be very
> expensive for bzr-git and bzr-hg branches because we need to fetch all
> data before we can look up the revno.  At the moment, we don't cache the
> fetched data anywhere, so we end up fetching it twice: once to look up
> the revid and once to actually import it.

Right, that's what I was afraid of.

>> It's also a bit lame in that it would be better to only import $N
>> _revisions_ at a time, not mainline revisions.  But I don't know how to
>> do that.  The above sketch might be good enough in any case.
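
For reference, here is a minimal runnable version of the mainline-limited pull discussed above. It is only a sketch: StubBranch is a hypothetical in-memory stand-in, not a bzrlib class, and the method names (revno, get_revid, pull with stop_revision) simply follow the pseudocode in this thread rather than any verified plugin API. The target revno is capped with min() so that it never runs past the foreign tip.

```python
SUCCESS = "SUCCESS"
PARTIAL_SUCCESS = "PARTIAL_SUCCESS"


class StubBranch:
    """Hypothetical stand-in for a branch; it models the mainline only."""

    def __init__(self, revids=()):
        self._revids = list(revids)

    def revno(self):
        # Number of mainline revisions currently in the branch.
        return len(self._revids)

    def get_revid(self, revno):
        # Revision numbers are 1-based, as in the sketch in the thread.
        return self._revids[revno - 1]

    def pull(self, source, stop_revision):
        # Copy the source mainline up to and including stop_revision.
        stop = source._revids.index(stop_revision) + 1
        self._revids = source._revids[:stop]


def import_step(local_branch, foreign_branch, limit):
    """Pull at most `limit` more mainline revisions and report the outcome."""
    local_revno = local_branch.revno()
    foreign_revno = foreign_branch.revno()
    if local_revno >= foreign_revno:
        # Nothing new to import (this also covers an empty foreign branch).
        return SUCCESS
    target_revno = min(local_revno + limit, foreign_revno)
    target_revid = foreign_branch.get_revid(target_revno)
    local_branch.pull(foreign_branch, stop_revision=target_revid)
    if target_revno == foreign_revno:
        return SUCCESS
    return PARTIAL_SUCCESS


if __name__ == "__main__":
    foreign = StubBranch("rev-%d" % i for i in range(1, 8))
    local = StubBranch()
    status = PARTIAL_SUCCESS
    while status == PARTIAL_SUCCESS:
        status = import_step(local, foreign, limit=3)
        print(status, local.revno())
```

The importer would call import_step once per scheduled job and requeue the import whenever it reports PARTIAL_SUCCESS. Note that the stub makes get_revid() free, which is exactly what is not true for bzr-git and bzr-hg as described above.
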
> The plugins should (with a trivial amount of work) be able to support an
> optional argument to only convert approximately X revisions.  I think
> this is probably a simpler and faster solution than using get_revid(),
> and it will also allow us to import only X real revisions rather
> than just X mainline revisions.

That would be great.  When can this be done by? :-)

>> The other thing that should be done is changing our bzr-git importer to
>> preserve the git pack files between partial imports, by changing bzr-git
>> to put them in a predictable location and then doing some work in the
>> importer to preserve them.  I think I'd rather Jelmer look at this part,
>> or at least provide me with very detailed instructions ...
> Is this a requirement before the incremental imports?

It's not strictly a requirement, but it means that for the kernel, we'll
transfer 55000 revisions for the first partial import, then 54000 for
the second, then 53000, ..., totaling rather a lot.  Tim thinks this is
more important than I do, it seems.

Cheers,
mwh

_______________________________________________
Mailing list: https://launchpad.net/~launchpad-dev
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~launchpad-dev
More help   : https://help.launchpad.net/ListHelp
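
To put a rough number on "rather a lot" for the repeated kernel transfers mentioned above: the sketch below assumes each partial import re-fetches every revision not yet imported (because the pack files are not preserved) and then converts a fixed batch. The batch size of 1000 is an assumption read off the 55000, 54000, 53000 sequence; the function name is made up for illustration.

```python
def total_transferred(total_revisions, step):
    """Return (attempts, revisions fetched in total) when nothing is cached."""
    remaining = total_revisions
    fetched = 0
    attempts = 0
    while remaining > 0:
        # Each attempt fetches everything still missing, then keeps `step`.
        fetched += remaining
        remaining -= step
        attempts += 1
    return attempts, fetched


if __name__ == "__main__":
    attempts, fetched = total_transferred(55000, 1000)
    print(attempts, fetched)  # 55 1540000
```

Under those assumptions the importer fetches about 1.54 million revisions over 55 attempts to import 55000, roughly 28 times the work of a single full fetch. With the pack files preserved between attempts, the total would stay close to 55000.
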

