Re: Git Merge 2013 Conference, Berlin
Scott Chacon wrote: We're starting off in Berlin, May 9-11th. GitHub has secured conference space at the Radisson Blu Berlin for those days. It's a pity that you did not announce the event on the msysgit mailing list too, which is why I totally missed it until today, with the event being almost over. This is especially sad for me as I live in Berlin, so it would have been easy for me to attend, and I had offered to help you organize the event when you were still looking for a location last year. I apologize, I will try to put events on that list as well in the future. Unfortunately, I read this mail too late. If you're going to organize something more regularly (with up to 25 people), we perhaps could do it in office-2.0. Just let me know if you're interested. cu -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Merge conflicts with version numbers in release branches
Thomas Koch wrote: it's a common problem[1,2,3] in Maven (Java) projects and probably in other environments too: You have the version number of your project written in the pom.xml. When one merges changes upwards from the maint branch to master, the version numbers in maint and master are different and cause a merge conflict.

My advice: don't merge directly, but rebase onto latest master. Maybe even rebase incrementally (eg. git rebase master~100, git rebase master~99, ...). This heavily reduces the chance of conflicts that need to be resolved manually.

I'm a big fan of topic branches. For example, we have some bug #1234 in the maintenance release. Fork off at latest maint, let's call the branch 1234_somewhat. Now do your bugfixing, testing, etc. When that's done, rebase on latest maint (in case maint moved further) and merge it into maint. Now rebase the 1234_somewhat branch onto master, do tests etc. and finally merge into it. (Note: all merges here will be fast-forward, IOW: pure append operations.)

Of course, none of this will make the conflicts on the version number change go away magically, but at least things will be clearer while resolving them. If you always do the version number changes in a separate commit with a specially formatted message (eg. 'Release: 1.2.4'), you could hack up some little filter-branch magic which automatically strips these commits out before rebasing.
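A sketch of that topic-branch flow in commands (bug number and branch names are made up; the `old_maint` bookkeeping is my addition, needed so that the rebase onto master replays only the fix commits):

```shell
# hypothetical bug #1234, fixed on a topic branch forked from maint
git checkout -b 1234_somewhat maint
# ... commit the fix, run tests ...
git rebase maint                          # in case maint moved on meanwhile

old_maint=$(git rev-parse maint)          # remember where maint was
git checkout maint
git merge --ff-only 1234_somewhat         # pure append operation

# replay only the fix commits onto master, then append there too
git rebase --onto master "$old_maint" 1234_somewhat
git checkout master
git merge --ff-only 1234_somewhat
```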
Re: git-p4: Importing a Git repository into Perforce without rebasing
Hi, [snip] perhaps you should give Perforce's git-bridge a try.

cu
--
Mit freundlichen Grüßen / Kind regards

Enrico Weigelt
VNC - Virtual Network Consult GmbH
Head Of Development
Pariser Platz 4a, D-10117 Berlin
Tel.: +49 (30) 3464615-20
Mobile: +49 (151) 27565287
Fax: +49 (30) 3464615-59
enrico.weig...@vnc.biz; www.vnc.de
Re: Auto-repo-repair
Hi, I still think that it would make the most sense to do the following (if you insist on some sort of automated repair): (1) Fetch a good clone (or clones) into a temporary directory; (2) Cannibalize the objects from it (them); (3) Re-run git fsck and check for still-missing / unreachable items; (4) IF THE RESULT OF (3) IS ACCEPTABLE, run git gc to clean up the mess, discard / merge duplicate objects, and fix up the packfiles. It is step (4) that requires the most user interaction. I could see building up a shell script that does all but (4) nearly automatically. None of this requires modifying Git itself.

Well, I'd like to have some really automatic mode which does everything on demand. Once we've got this, it's useful not just for repair, but also to support quick partial clones that fetch more objects when required. In fact, finally, I'd like to have some storage cloud where data automatically gets replicated to the nodes which need it, not just for VCS, but for other purposes (backup, filestore, etc.) too. But before inventing something completely new (reinventing much of the wheel), I'd like to investigate whether git can be extended in this direction step by step. cu
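Steps (1)-(3) can indeed be scripted with nothing but stock git; a minimal sketch, run from inside the broken repository (`$GOOD_URL` is a placeholder for a known-good clone):

```shell
# (1) fetch a known-good clone into a temporary directory
tmp=$(mktemp -d)
git clone --quiet --bare "$GOOD_URL" "$tmp/good.git"

# (2) cannibalize its objects by fetching everything into a backup ref namespace
git fetch --quiet "$tmp/good.git" '+refs/*:refs/backup/*'

# (3) re-check for still-missing or unreachable items
git fsck --full

# (4) stays manual: inspect the result, then `git gc` and drop the refs/backup/* refs
```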
Re: Auto-repo-repair
How would the broken repository be sure of what it is missing to request it from the other side? fsck will find missing objects. And what about the objects referred to by objects that are missing? They will be fetched after multiple iterations. We could even introduce some 'fsck --autorepair' mode, which triggers it to fetch any missing object from its remotes. Maybe even introduce a concept of peer object stores, which (a bit like alternates) are asked for objects that aren't locally available - that could even be a plain venti store. cu
Re: Auto-repo-repair
How would the broken repository be sure of what it is missing to request it from the other side? fsck will find missing objects.
Re: Local clones aka forks disk size optimization
Hi, That's not the only problem. I believe you only get the savings when the main repo gets the commits first. Which is probably ok most of the time, but it's worth mentioning. Well, the saving will just be deferred to the point where the commit finally went to the main repo and the downstreams are gc'ed. hmm, distributed GC is a tricky problem. Except for one little issue (see other thread, subject line cloning a namespace downloads all the objects), namespaces appear to do everything we want in terms of the typical use cases for alternates, and/or 'git clone -l', at least on the server side. hmm, not sure about the actual internals, but the namespace filtering should work in a way that a local clone never sees (or considers) remote refs outside of the requested namespace. Perhaps that should be handled entirely on the server side, so all called commands treat these refs as nonexistent. By the way: what happens if one tries to clone from a broken repo (one which has several refs pointing to nonexistent objects)? cu
Re: Auto-repo-repair
Hi, You can't reliably just grab the broken objects, because most transports don't support grabbing arbitrary objects (you can do it if you have shell access to a known-good repository, but it's not automated). Can we introduce a new transport, or extend the existing ones, to support that? cu
Auto-repo-repair
Hi folks, suppose the following scenario: I've broken some repo (missing objects), eg. by messing something up with alternates, a broken filesystem, or whatever. And I've got a bunch of remotes which (together) contain all of the lost objects. Now I'd like to run some $magic_command which automatically fetches all the missing objects and thus repairs my local repo. Is this already possible right now? thx
Re: Local clones aka forks disk size optimization
Provide one main clone which is bare, pulls automatically, and is there to stay (no pruning), so that all others can use that as a reliable alternates source. The problem here, IMHO, is the assumption that the main repo will never be cleaned up. But what to do if you don't want to let it grow forever? hmm, distributed GC is a tricky problem. Maybe it could be easier having two kinds of alternates:

a) classical: gc+friends will drop local objects that are already there
b) fallback: normal operations fetch objects if they are not accessible from anywhere else, but gc+friends do not skip objects from there.

And extend the prune machinery to put a backup of the dropped objects into some separate store. This way we could use some kind of rotating archive:

* GC'ed objects will be stored in the backup repo for a while
* there are multiple active (rotating) backups kept for some time; each cycle, only the oldest one is dropped (and maybe objects in a newer backup are removed from the older ones)
* downstream repos must be synced often enough, so removed objects are fetched back from the backups early enough

You could see this as a kind of heap:

* the currently active objects (directly referenced) are always on the top
* once they're not referenced, they sink a level deeper
* when they're referenced again, they immediately jump up to the top
* at some point in time unreferenced objects sink so deep that they're dropped completely

cu
Re: Bizarre problem cloning repo from Codeplex
Their webserver seems to be configured quite restrictively (eg. cannot access files like 'packed-refs'). Probably it just doesn't exist. Aren't these files required? cu
Re: bare vs non-bare <1.7 then >=1.7 ?
When experimenting in order to train some colleagues, I saw that if I clone a repository, I couldn't push to it because it was a non-bare one. Searching for some explanations, I found this resource: http://www.bitflop.com/document/111 That's just a precaution (technically it's not necessary, it just stops you from doing some dumb things). Suppose the following scenario:

* non-bare repository A, with branch 'master' currently checked out
* clone B - somebody's working on branch 'master' (which was forked from A's master)
* on A, somebody makes some local changes
* meanwhile somebody pushes the branch 'master' from B to A
* after that, on A, a new commit to 'master'

Weird things can happen, eg. the changes coming from B being completely reverted by the new commit in A. As long as nobody pushes to the currently checked-out branch and then makes local changes on top of that, there shouldn't be any real technical problem. But then, you most likely won't need a worktree anyway. Wait, there *is* a use case for such things: deploying trees (eg. webapps) to some server:

* the application is developed in git
* the final production-system tree is maintained in a certain branch
* a post-update hook acts on a specific production branch and does something like git checkout --detach treeish

cu
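A minimal sketch of such a post-update hook (branch name and deploy path are made up; post-update receives the pushed ref names as its arguments):

```shell
#!/bin/sh
# hooks/post-update in the server-side repository (sketch)
# deploy only when the hypothetical 'production' branch was pushed
for ref in "$@"; do
    if [ "$ref" = "refs/heads/production" ]; then
        GIT_WORK_TREE=/srv/webapp git checkout -f --detach refs/heads/production
    fi
done
```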
Re: Support for a series of patches, i.e. patchset or changeset?
[snip] yet another idea: you could always put your patchsets into separate branches, rebase them on top of the target branch before merging, and then do a non-ff merge, which will make the history look like:

*   merged origin/feature_foo
|\
| * first preparation for feature foo
| * part a
| * part b
|/
*   merged origin/bugfix_blah
|\
| * fixing bug blah
|/
*

cu
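In commands, the rebase-then-non-ff-merge idea above comes down to (made-up branch names):

```shell
git checkout feature_foo
git rebase master              # replay the patchset on top of the target branch
git checkout master
git merge --no-ff feature_foo  # force a merge commit so the set stays grouped
```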
Re: Workflow for templates?
refs/heads/* to the local refs/remotes/origin/*, and the remote side's refs/tags/* to the local refs/tags (without overwriting existing tag references).

#3: git remote update origin -- does the actual syncing from the remote origin: it gets the remote ref list, downloads all not-yet-present objects (that are required for the refs to be synced) and adds/updates the refs in the corresponding target namespaces. (BTW: if a branch was removed on the remote side, the local copy in refs/remotes/remote-name/* won't be deleted - you'll need to call git remote prune remote-name for that.)

#4: git checkout origin/master -b master -- copies the current refs/remotes/origin/master ref to refs/heads/master and checks out that new local branch (IOW: sets the symbolic ref HEAD to refs/heads/master and copies index and working tree from the head commit).

Branches are something completely different: Logically, a branch is a history of commits with parent/child relationships (mathematically speaking, it's a directed acyclic graph): each commit may have a variable number of parent commits. Technically, what we usually call a branch is in fact a name (a reference in the refs/heads/* namespace) which points at the head commit of that local branch. When you do git commit, it creates a new commit object from the index, adds some metadata (eg. your commit message) and sets the current branch reference (usually the one the symbolic ref HEAD points to) to the new commit object's SHA key. IOW: you add a new object in front of the DAG and move the pointer one step forward in the line.
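Put in command form, steps #3 and #4 above (assuming the usual origin remote) are just:

```shell
git remote update origin             # fetch into refs/remotes/origin/*
git remote prune origin              # drop remote-tracking refs deleted upstream
git checkout -b master origin/master # fork a local branch off the remote ref
```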
When you do a merge (no matter if the source is remote or local - it just needs to be a locally available object), there are essentially two things that can happen:

a) your source is a direct descendant of the target branch (IOW: the target's current head commit appears somewhere in the source's history): git will just move the current branch forward to the merge source (moves the head pointer and updates index and worktree). This is called fast-forward (in fact, it's the fastest kind of merge).

b) your source is not a direct descendant: the source tree will actually be merged into index/worktree, possibly stopping when there are conflicts to be resolved manually, and a new commit is created containing the current (now merged) index and two parent pointers, to the source and to the previous merge target.

Now what is rebase? A rebase rewrites history in various ways (in fact, you can do a lot more things than just simple rebasing, eg. edit or drop older commits, etc). For example 'git rebase origin/master' will look for the latest common ancestor of both the current and the target treeish (eg. refs/remotes/origin/master), start from that treeish and apply the changes that happened from the last common ancestor up to your current branch head on top of that treeish (possibly asking the user to manually resolve some conflicts), and then replace the current branch head with the final head. As it changes history, it should be used wisely.

A common problem with using rebase on public branches is:

* upstream changes history (eg. because they rebased onto their upstream)
* downstream (per default) merges this upstream into their branch -- git will see two entirely different branches get merged, so there's a good chance of nasty conflicts, and history will easily get really ugly

So, if you do rebase your public branch, downstreams should also do so (rebase their local branches on top of your public branch instead of merging yours into theirs).
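Case (a) can be tested for explicitly before merging, e.g.:

```shell
# while on the target branch: fast-forward is possible iff its head
# is an ancestor of the merge source (branch name is made up)
if git merge-base --is-ancestor HEAD feature_foo; then
    git merge --ff-only feature_foo   # (a) just move the pointer
else
    git merge feature_foo             # (b) real merge commit
fi
```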
By the way: there are several more kinds of rebases, which are very interesting for complex or sophisticated workflows, eg:

* --onto rebase: instead of letting git find the starting point of the commit sequence to apply on the target treeish, you define it explicitly (eg. if you want it to forget about things previous to the starting treeish).
* interactive rebase: a) it is able to reconstruct merges, b) it allows you to cut into the sequence and change, drop or add new commits.

These operations are very useful for cleaning up history, especially with things like a topic-branch workflow (eg. if you originally have some hackish and unclean commits and you want to put a clean and self-consistent one into your mainline instead). cu
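The explicit-start variant looks like this (all three names are placeholders):

```shell
# replay only the commits between oldbase and topic onto newbase,
# ignoring everything up to (and including) oldbase
git rebase --onto newbase oldbase topic
```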
Re: Workflow for templates?
Let me ask a different question: What is wrong with cherry-picking downstream changes to your upstream branch? Without rebasing it to downstream. Naah, don't rebase the upstream on top of the downstream - that doesn't make any sense (yeah, my devs sometimes do exactly this wrong ;-o). Instead, as you just said, cherry-pick the good commits into your upstream branch and rebase your downstreams on top of that. (It doesn't make any difference if this is done by different people or in different administrative domains.) That might mean there is a rather useless merge downstream later on, but that's the price you pay for not doing the change in a development branch. That's one of the things rebase is for: not having your history filled up with merges at all, but always having your local customizations added on top of the current upstream.

By the way: I'm also using this hierarchy for package maintenance for different target distros:

upstream branch
 |
 |  upstream release tag X.Y.Z
 |   \
 |    bugfix branch (maint-X-Y-Z) = general (eg. distro-agnostic) fixes go here
 |     |
 |     |- maintenance release tag X.Y.Z.A
 |      \
 |       dist branch (mydist-X-Y-Z) = distro-specific customizations (eg.
 |        |                           packaging control files, etc) go here
 |        |-- dist package release tags X.Y.Z.A-B

Usually I do quick hotfixes in the dist branch (assigning a new dist version number), then copy the dist branch into some topic branch, rebase it onto the latest bugfix branch, and cherry-pick the interesting commit(s) into the bugfix branch. When I do a new bugfix release (from my bugfix branch), I rebase the dist branch on top of the latest bugfix release tag, fix the dist-package version numbers and run the dist-specific build and testing pipeline.
Here's an example of it: https://github.com/vnc-biz/redmine-core cu
Re: Bizarre problem cloning repo from Codeplex
I'm trying to clone the following repository from Codeplex: https://git01.codeplex.com/entityframework.git git downloads all the objects, creates the directory entityframework, then displays error: RPC failed; result=56, HTTP code = 200 and immediately deletes the directory. I can clone other HTTPS repos with this git installation, for example from Bitbucket and Github. It's git 1.7.10.4 on Debian.

I reproduced it on Ubuntu precise, git-1.7.9.5. When starting with an empty repo, adding the URL as remote and calling git remote update origin:

Fetching origin
WARNING: gnome-keyring:: couldn't connect to: /tmp/keyring-5cWq1d/pkcs11: No such file or directory
remote: Counting objects: 21339, done.
remote: Compressing objects: 100% (3778/3778), done.
remote: Total 21339 (delta 17180), reused 21339 (delta 17180)
Receiving objects: 100% (21339/21339), 11.24 MiB | 1.04 MiB/s, done.
error: RPC failed; result=56, HTTP code = 200
Resolving deltas: 100% (17180/17180), done.
error: Could not fetch origin

But: refs/remotes/origin/master is added and looks sane (git fsck shows no errors). Subsequent 'git remote update' calls look good:

Fetching origin

Even after manually removing the ref and re-running update, everything looks fine:

Fetching origin
WARNING: gnome-keyring:: couldn't connect to: /tmp/keyring-5cWq1d/pkcs11: No such file or directory
From https://git01.codeplex.com/entityframework
 * [new branch]      master     -> origin/master

Their webserver seems to be configured quite restrictively (eg. cannot access files like 'packed-refs'). Is there a way to trace the actual HTTP calls?
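One way that should help here: git's smart-HTTP transport goes through curl, and setting GIT_CURL_VERBOSE makes it dump the request/response headers to stderr (this is a real git environment variable; the remote name is whatever you configured):

```shell
# trace the HTTP calls of any fetch/clone over http(s)
GIT_CURL_VERBOSE=1 git fetch origin
```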
cu
Re: git-svn with ignore-paths misses/skips some revisions during fetch
The problem is that the 'ignore-paths' approach sometimes misses commits during a fetch, and then at some later time will realize it and squash those changes onto some other, unrelated commit. (I've never seen this happen with the per-subdir 'fetch' approach.) Here are three commits in SVN: [...] Could it be that certain files spent parts of their historical lifetime inside the ignored paths? cu
Re: Workflow for templates?
Hi, [snip] I'd suggest a 3-level branch hierarchy (IOW: each lower level is rebased on top of the next higher level):

* #0: upstream branch
* #1: generic local maintenance branch
* #2: per-instance customization branches

Normal additions go to the lowest level, #2. When you've got some generic commit, you propagate it to the next level (cherry-pick) and rebase layer #2 on top of it. Now you can send your layer #1 to upstream for integration. When upstream has updated his branch, you simply rebase #1 on top of it, do your checks etc., then proceed to rebasing #2. You could also introduce more intermediate layers (eg. when you've got different groups of similar instances that share certain changes). cu
Re: filter-branch IO optimization
Hi, The usual advice is use an index-filter instead. It's *much* faster than a tree filter. However: I've tried the last example from the git-filter-branch manpage, but failed. Seems like the GIT_INDEX_FILE env variable doesn't get honoured by git-update-index, no index.new file gets created, and so the mv call fails. My second try (as index-filter command) was:

git ls-files -s > ../_INDEX_TMP
cat ../_INDEX_TMP | sed 's-\t-&addons/-' | git update-index --index-info
rm -f ../_INDEX_TMP

(the sed expression is meant to prefix each path with addons/). It works fine in the worktree (I see files renamed in the index), but no success when running it as --index-filter. Seems the index file isn't used at all (or some completely different one). By the way, inside the index filter, GIT_INDEX_FILE here is /home/devel/vnc/openerp/workspace/pkg/openerp-extra-bundle.git/.git-rewrite/t/../index - obviously a different (temporary) index file, while many examples on the web suggest using commands like 'git add --cached' or 'git rm --cached' _without_ passing the GIT_INDEX_FILE variable. Could there be some bug that this variable isn't honored properly everywhere?
Re: filter-branch IO optimization
[snip] Did some more experiments, and it seems the missing index file isn't automatically created. When I instead copy the original index file to the temporary location, it runs well. But I still have to wait for the final result to check whether it really overwrites the whole index or just adds new files. cu
Re: filter-branch IO optimization
Hi folks, now I finally managed the index-filter part. The main problem, IIRC, was that git-update-index didn't automatically create an empty index, so I needed to explicitly copy one in (manually created from an empty repo). My current filter code is:

if [ ! "$GIT_AUTHOR_EMAIL" ] && [ ! "$GIT_COMMITTER_EMAIL" ]; then
    export GIT_AUTHOR_EMAIL=nob...@none.org
    export GIT_COMMITTER_EMAIL=nob...@none.org
elif [ ! "$GIT_AUTHOR_EMAIL" ]; then
    export GIT_AUTHOR_EMAIL=$GIT_COMMITTER_EMAIL
elif [ ! "$GIT_COMMITTER_EMAIL" ]; then
    export GIT_COMMITTER_EMAIL=$GIT_AUTHOR_EMAIL
fi

if [ ! "$GIT_AUTHOR_NAME" ] && [ ! "$GIT_COMMITTER_NAME" ]; then
    export GIT_AUTHOR_NAME=nob...@none.org
    export GIT_COMMITTER_NAME=nob...@none.org
elif [ ! "$GIT_AUTHOR_NAME" ]; then
    export GIT_AUTHOR_NAME=$GIT_COMMITTER_NAME
elif [ ! "$GIT_COMMITTER_NAME" ]; then
    export GIT_COMMITTER_NAME=$GIT_AUTHOR_NAME
fi

cp ../../../../scripts/index.empty "$GIT_INDEX_FILE.new"
git ls-files -s | sed 's-\t-&addons/-' | grep "addons/$module" | \
    ( export GIT_INDEX_FILE="$GIT_INDEX_FILE.new"; git update-index --index-info )
mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"

Now another problem: this leaves behind thousands of now-empty merge nodes (--prune-empty doesn't seem to catch them all), so I loop through additional `git filter-branch --prune-empty` runs until the ref remains unchanged. This process is even more time-consuming, as it takes really many passes (haven't counted them yet). Does anyone have an idea why a single run doesn't catch them all? cu
filter-branch IO optimization
Hi folks, for certain projects, I need to regularly run filter-branch on quite large repos (10k commits), and that needs to be run multiple times, which takes several hours, so I'm looking for optimizations. The main goal of this filtering is splitting out many modules from a large upstream repo into their own downstream repos. This process should be fully deterministic (IOW: running it twice on the same input should produce exactly the same output, so commit IDs stay the same after subsequent runs). My current approach is most likely still a bit too naive:

#1: fork off a new branch from current upstream
#2: run a tree-filter which:
 * removes all files not belonging to the wanted module
 * moves the module directory under another subdir (./addons/)
 * fixes author/committer name/email if empty (because otherwise it fails)
 * fixes character sets and indentation of source files
#3: loop through `git filter-branch --prune-empty` to get rid of empty merge nodes (of which otherwise really a lot remain), until the branch remains unchanged
#4: run a plain rebase onto the initial commit to linearize the history

All of that is done on a per-module basis (for now only about 10, but soon there can be many more). One thing I haven't tried yet is using the -d option to move the .git-rewrite dir to a tmpfs (have to clarify some operating considerations first) ;-o The next step I have in mind is using --subdirectory-filter, but open questions are:

* does it suffer from the same problems with empty username/email as --tree-filter? If yes: what can I do about it (have an additional pass for fixing that before running the --tree-filter)?
* can I somehow teach the --subdirectory-filter to place the result under some subdir instead of directly at the root?
* can I use --tree-filter in combination with --subdirectory-filter? Which one is executed first?
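Step #3, the prune loop, could be sketched like this (the branch name is a placeholder):

```shell
# re-run --prune-empty until the branch head stops moving
branch=module-split
old=
new=$(git rev-parse "refs/heads/$branch")
while [ "$new" != "$old" ]; do
    old=$new
    git filter-branch -f --prune-empty "$branch"
    new=$(git rev-parse "refs/heads/$branch")
done
```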
thanks
Re: How can I tell if anything was fetched?
I think you'd only need to record the state of all refs (eg. the output of `git for-each-ref`) to reliably detect any changes. I would just record the output of `git ls-remote . | sort -u` somewhere and compare it next time (maybe you even want to grep for the desired ref namespaces). cu
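A minimal sketch of that record-and-compare idea (the state file location is arbitrary):

```shell
state=.git/refs-snapshot
git ls-remote . | sort -u > "$state.new"
if cmp -s "$state.new" "$state"; then
    echo "nothing changed"
else
    echo "refs changed"
fi
mv "$state.new" "$state"
```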
Re: A basic question
> 1) Does git have a built-in way to get a list of all of the most
> recently committed files only at a given point in time, thus
> automatically recording the revisions of all of the component files of a
> release?

There is no concept of per-file revisions in git. But you can check which ones changed, in multiple ways, eg:

* per commit, commit range or branch - see the git-log manpage
* between arbitrary commits - see the git-diff manpage

> This implies that for files which are being modified or which have been
> staged but not committed, that git would go back to find the predecessor
> file which had been committed.

Forget about the staging issue (index) at this point - it only exists in the _local_ clone (eg. of some individual developer); for your use case you're only interested in what's actually committed to certain branch(es).

cu
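For example (the tag and branch names here are only illustrative):

```shell
# files touched by a single commit
git show --name-only --pretty=format: HEAD

# files changed between two arbitrary commits (eg. two release tags)
git diff --name-only v1.0 v1.1

# files changed per commit along a branch
git log --name-only --pretty=format:'%h %s' master
```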
Re: The GitTogether
Hi,

> Also, there are many academic institutions, and at least some might be
> happy to host an event like this (I'm thinking especially of the Zuse
> institute). I had offered to help with hooking up with them several
> weeks ago already. So, it's really just a matter of communication.

Yep. Another option could be Fraunhofer (we've got connections there), or maybe Office-2.0 or Tempelhof Airport. I've already pinged our business people; they're quite interested.

cu
Re: Failing svn imports from apache.org
> > Does anyone have an idea what might be wrong here / how to fix it ?
>
> Here:
>   git svn --version
>   git-svn version 1.7.12.592.g41e7905 (svn 1.6.18)
> What's yours?

1.7.9.5 (ubuntu precise)

> I'm getting
>   Initialized empty Git repository in /tmp/discovery/.git/
>   Using higher level of URL: http://svn.apache.org/repos/asf/commons/proper/discovery => http://svn.apache.org/repos/asf
>   W: Ignoring error from SVN, path probably does not exist: (160013): Filesystem has no entry: File not found: revision 100, path '/commons/proper/discovery'
>   W: Do not be alarmed at the above message git-svn is just searching aggressively for old history.
>   This may take a while on large repositories
> and then it checks the revisions. I didn't want to wait for r1301705...
> Does your git svn abort earlier or after checking all revs?

It also scanned through thousands of revisions and then failed:

  W: Do not be alarmed at the above message git-svn is just searching aggressively for old history.
  This may take a while on large repositories
  mkdir .git: No such file or directory at /usr/lib/git-core/git-svn line 3669

cu
Failing svn imports from apache.org
Hi folks,

I'm currently trying to import from the apache.org svn server, without success. See:

  git@moonshine:~/projects/common/libs$ git svn clone --stdlayout http://svn.apache.org/repos/asf/commons/proper/discovery/
  Initialized empty Git repository in /home/git/projects/common/libs/discovery/.git/
  W: Ignoring error from SVN, path probably does not exist: (160013): Filesystem has no item: '/repos/asf/!svn/bc/100/commons/proper/discovery' path not found
  W: Do not be alarmed at the above message git-svn is just searching aggressively for old history.
  This may take a while on large repositories
  mkdir .git: No such file or directory at /usr/lib/git-core/git-svn line 3669

Does anyone have an idea what might be wrong here / how to fix it?

thx
Re: Encrypted repositories
> > Well, everybody can access the objects, but they're encrypted, so you
> > need the repo key (which, of course, isn't contained in the repo itself
> > ;-p) to decrypt them.
>
> So, in short, blobs are not encrypted with the hash of their contents as
> encryption keys at all.

No, the blobs are encrypted with their content hash as key, and the encrypted blob will be stored under its own content hash as object id.

> > For the use cases I have in mind (backups, filesharing, etc.) this
> > wouldn't hurt so much, if the objects are compressed before encryption.
>
> For that kind of usage pattern, you are better off looking at encrypted
> tarballs or zip archives.

No, that doesn't give us anything like history, incremental synchronization, etc., etc. What I finally want is a usual git, just with encryption, but I can live with losing differential compression.

cu
Re: Encrypted repositories
Hi,

> Enrico Weigelt enrico.weig...@vnc.biz writes:
>
> > * blobs are encrypted with their (original) content hash as encryption
> >   keys
>
> What does this even mean? Is it expected that anybody who has access to
> the repository can learn the names of objects (e.g. by running ls
> .git/objects/??/)? If so, from whom are you protecting your repository?

Well, everybody can access the objects, but they're encrypted, so you need the repo key (which, of course, isn't contained in the repo itself ;-p) to decrypt them. The whole tree will still be consistent even without encryption support (so gc etc. shouldn't break), but to actually _use_ the repo (eg. checkout or adding new commits), you'll need the encryption support and the repo key (well, committing should theoretically even work with a different repo key, even though this doesn't make much sense ;-)).

> How does this encryption interact with the delta compression employed in
> pack generation?

Probably not at all ;-o For the use cases I have in mind (backups, filesharing, etc.) this wouldn't hurt so much, if the objects are compressed before encryption.

cu
Re: split directory objects
Hi,

> > IIRC some people were working on split directory objects (eg. for
> > putting subdirs into their own objects), some time ago.
>
> Each directory maps to its own tree object, so a subdirectory is stored
> in its own object. It happened on April 7th, 2005, if not earlier.

Ah, great :) Maybe I've mixed that up with the discussion about splitting large files into several objects. What's the status on this?

cu
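That each directory already is its own tree object can be seen directly in any repo with a subdirectory:

```shell
# list the top-level tree of HEAD; subdirectories show up as separate
# 'tree' entries, each with its own object id
git cat-file -p 'HEAD^{tree}'
```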
Encrypted repositories
Hi,

I'm currently planning to implement strong encryption in git (not like gitcrypt, but with encrypted blobs, directories, etc., directly in the core). The idea goes like this:

* blobs are encrypted with their (original) content hash as encryption key
* directory objects only hold randomized filenames and pointers to the encrypted blobs (content hash of the encrypted data)
* new ext-directory objects hold a mapping of the randomized file names to the real ones plus the encryption keys, stored encrypted similarly to the blobs
* the ext-directory object is referenced by a special filename in the directory object
* commit objects also hold an encrypted section (eg. uuencoded) with the ext-directory node's key, additional commit text, etc., itself encrypted with the repository key

This way, the low-level / bare repository operations (including remote sync and gc) should continue to work, while only actual access (eg. checkout or commit) needs to be changed and needs the repository key available.

What do you think about this approach?

cu
split directory objects
Hi folks,

IIRC some people were working on split directory objects (eg. for putting subdirs into their own objects), some time ago. What's the current status of this?

cu
Re: diff/merge tool that ignores whitespace changes
Hi,

> Would that help?
>
> git help diff
> [snip]
>   --ignore-space-at-eol
>       Ignore changes in whitespace at EOL.
>
>   -b, --ignore-space-change
>       Ignore changes in amount of whitespace. This ignores whitespace at
>       line end, and considers all other sequences of one or more
>       whitespace characters to be equivalent.
>
>   -w, --ignore-all-space
>       Ignore whitespace when comparing lines. This ignores differences
>       even if one line has whitespace where the other line has none.

That might be it :) Now I still need to find out how to configure tig for it. By the way: anything similar for merge/rebase?

thx
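For merge/rebase there are matching options for the recursive merge strategy, passed through with -X (branch names below are only illustrative):

```shell
# diff, ignoring all whitespace
git diff -w

# merge/rebase with whitespace-tolerant conflict resolution
git merge -Xignore-all-space topic
git rebase -Xignore-space-change master
```

Note these only affect how conflicts are detected and resolved; whichever side is taken is still recorded with its whitespace as-is.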
Re: diff/merge tool that ignores whitespace changes
> snip

Thanks folks, but that doesn't solve my problem. I'm looking for something that's usable on the command line or in scripts.

Usecase a)
* git-diff, git-format-patch or tig should not show differences that are only whitespace changes (eg. differing linefeeds, tabs vs. spaces, changed indentation, etc.)

Usecase b)
* when doing merges or rebases, whitespace-only changes should either be ignored or resolved fully automatically.
* For example:
  - A changes spaces into tabs or adds leading/trailing spaces
  - B changes some non-spaces

cu
diff/merge tool that ignores whitespace changes
Hi folks,

I'm looking for a diff / merge tool that treats lines with only whitespace changes (trailing or leading whitespace, linefeeds, etc.) as equal. The goal is to make reviews as well as merging or rebasing easier when things like indentation often change.

Does anybody know a solution for that?

thx