Re: Hash algorithm analysis
On 07/23/2018 06:10 PM, demerphq wrote:
> On Sun, 22 Jul 2018 at 01:59, brian m. carlson wrote:
>> I will admit that I don't love making this decision by myself, because
>> right now, whatever I pick, somebody is going to be unhappy. I want to
>> state, unambiguously, that I'm trying to make a decision that is in the
>> interests of the Git Project, the community, and our users.
>>
>> I'm happy to wait a few more days to see if a consensus develops; if so,
>> I'll follow it. If we haven't come to one by, say, Wednesday, I'll make
>> a decision and write my patches accordingly. The community is free, as
>> always, to reject my patches if taking them is not in the interest of
>> the project.
>
> Hi Brian.
>
> I do not envy you this decision.
>
> Personally I would aim towards pushing this decision out to the git
> user base and facilitating things so we can choose whatever hash
> function (and config) we wish, including ones not invented yet.
>
> Failing that I would aim towards a hashing strategy which has the most
> flexibility. Keccak for instance has the interesting property that its
> security level is tunable, and that it can produce arbitrarily long
> hashes. Leaving aside other concerns raised elsewhere in this thread,
> these two features alone seem to make it a superior choice for an
> initial implementation. You can find bugs by selecting unusual hash
> sizes, including very long ones, and you can provide ways to tune the
> function to people's security and speed preferences. Someone really
> paranoid can specify an unusually large round count and a very long
> hash.
>
> Also frankly I keep thinking that the ability to arbitrarily extend
> the hash size has to be useful /somewhere/ in git.

I would not suggest arbitrarily long hashes. Not only would it complicate a lot of code, it is not clear that it has any real benefit.
Plus, the code contortions required to support arbitrarily long hashes would be more susceptible to potential bugs and exploits, simply by being more complex code. Why take chances?

I would suggest (a) a hash size of 256 bits and (b) a choice of any hash function that can produce such a hash. If people feel strongly that 256 bits may also turn out to be too small (really?), then a choice of 256 or 512, but not arbitrary sizes.

Sitaram
(also not a cryptographer!)
Re: Use different ssh keys for different github repos (per-url sshCommand)
On 07/19/2018 06:52 PM, Sitaram Chamarty wrote: > On Thu, Jul 19, 2018 at 03:24:54PM +0300, Basin Ilya wrote: >> Hi. >> >> I have two github accounts, one is for my organization and I want git to >> automatically choose the correct ssh `IdentityFile` based on the clone URL: >> >> g...@github.com:other/publicrepo.git >>~/.ssh/id_rsa >> g...@github.com:theorganization/privaterepo.git >>~/.ssh/id_rsa.theorganization >> >> Unfortunately, both URLs have same host name, therefore I can't configure >> this in the ssh client config. I could create a host alias there, but >> sometimes somebody else gives me the github URL and I want it to work out of >> the box. >> >> I thought I could add a per-URL `core` section similar to `user` and `http`, >> but this section is ignored by git (2.18): >> >> [core "g...@github.com:theorganization"] >> sshCommand = /bin/false >> #sshCommand = ssh -i ~/.ssh/id_rsa.theorganization >> >> I thought of writing a wrapper script to deduce the key from the arguments: >> >> g...@github.com git-upload-pack '/theorganization/privaterepo.git' >> >> Is this the only option? > > This is what I do (I don't have two accounts on github, but > elsewhere; same idea though) my apologies; I did not read your email fully and went off half-cocked! Looks like you already tried host aliases and they don't work for you. Sorry for the noise!
Re: Use different ssh keys for different github repos (per-url sshCommand)
On Thu, Jul 19, 2018 at 03:24:54PM +0300, Basin Ilya wrote:
> Hi.
>
> I have two github accounts, one is for my organization and I want git to
> automatically choose the correct ssh `IdentityFile` based on the clone URL:
>
> g...@github.com:other/publicrepo.git             ~/.ssh/id_rsa
> g...@github.com:theorganization/privaterepo.git  ~/.ssh/id_rsa.theorganization
>
> Unfortunately, both URLs have same host name, therefore I can't configure
> this in the ssh client config. I could create a host alias there, but
> sometimes somebody else gives me the github URL and I want it to work out of
> the box.
>
> I thought I could add a per-URL `core` section similar to `user` and `http`,
> but this section is ignored by git (2.18):
>
> [core "g...@github.com:theorganization"]
>     sshCommand = /bin/false
>     #sshCommand = ssh -i ~/.ssh/id_rsa.theorganization
>
> I thought of writing a wrapper script to deduce the key from the arguments:
>
> g...@github.com git-upload-pack '/theorganization/privaterepo.git'
>
> Is this the only option?

This is what I do (I don't have two accounts on github, but elsewhere; same idea though):

    # this goes in ~/.ssh/config
    host gh1
        user git
        hostname github.com
        identityfile ~/.ssh/id_rsa_1

    host gh2
        user git
        hostname github.com
        identityfile ~/.ssh/id_rsa_2

Now use "gh1:username/reponame" and "gh2:username/reponame" as URLs. It all just works.
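[Editorial aside, not from the thread: for the "somebody gives me the raw github URL" case, a hedged sketch using git's own URL rewriting can route pasted URLs through a host alias. It assumes the "gh2" alias from the ssh config above; the demo points HOME at a throwaway directory so the real ~/.gitconfig is untouched.]

```shell
# Sketch: rewrite any URL that starts with the organization's github
# path so that it goes through the "gh2" ssh host alias (and hence
# picks up that alias's identityfile).
export HOME=$(mktemp -d)    # throwaway config area for the demo
git config --global url."gh2:theorganization/".insteadOf \
    "git@github.com:theorganization/"

# --get-url expands insteadOf rules without contacting the remote, so
# you can see the rewrite that clone/fetch/push would use:
git ls-remote --get-url git@github.com:theorganization/privaterepo.git
```

With this in place, a pasted `git@github.com:theorganization/...` URL is mapped to `gh2:theorganization/...` before ssh is invoked, so it "works out of the box".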
Re: why does builtin/init-db.c mention "/etc/core-git/templates/hooks/update"?
On Mon, May 28, 2018 at 09:27:18AM -0400, Robert P. J. Day wrote:

[snipped the rest because I really don't know]

> more to the point, is that actually what the "update" hook does? i
> just looked at the shipped sample, "update.sample", and it seems to be
> related to tags:
>
> #!/bin/sh
> #
> # An example hook script to block unannotated tags from entering.

No, that's just a sample. An update hook can do pretty much anything, and if it exits with a 0 status code, the actual update succeeds. If it exits with any non-zero exit code, the update will fail. This is (usually) the basis for a lot of checks that people may want, from commit message format to access control at the ref (branch/tag) level for write operations.
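[Editorial aside: to make the "exit 0 to allow, non-zero to deny" contract concrete, here is a hypothetical sketch of update-hook logic — not git's shipped sample. git invokes the hook as `update <refname> <old-sha> <new-sha>`, with the new SHA all zeros when the ref is being deleted; the policy below (refusing branch deletions) is just an illustrative choice.]

```shell
# Hypothetical update-hook core: refuse branch deletions, allow
# everything else. A deletion shows up as a new SHA of all zeros.
zero=0000000000000000000000000000000000000000

check_update() {
    refname=$1; oldsha=$2; newsha=$3
    case "$refname" in
        refs/heads/*)
            if [ "$newsha" = "$zero" ]; then
                echo "denied: deleting $refname" >&2
                return 1    # non-zero: git rejects this ref update
            fi
            ;;
    esac
    return 0                # zero: git accepts the update
}

# A real hook file would end with: check_update "$1" "$2" "$3" || exit 1
```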
Re: worktrees vs. alternates
On Wed, May 16, 2018 at 04:02:53PM -0400, Konstantin Ryabitsev wrote:
> On 05/16/18 15:37, Jeff King wrote:
>> Yes, that's pretty close to what we do at GitHub. Before doing any
>> repacking in the mother repo, we actually do the equivalent of:
>>
>>   git fetch --prune ../$id.git +refs/*:refs/remotes/$id/*
>>   git repack -Adl
>>
>> from each child to pick up any new objects to de-duplicate (our "mother"
>> repos are not real repos at all, but just big shared-object stores).
>
> Yes, I keep thinking of doing the same, too -- instead of using
> torvalds/linux.git for alternates, have an internal repo where objects
> from all forks are stored. This conversation may finally give me the
> shove I've been needing to poke at this. :)

I may have missed a few of the earlier messages, but in the last 20 or so in this thread, I did not see namespaces mentioned by anyone. (I.e., apologies if it was addressed and discarded earlier!)

I was under the impression that, as long as "read" access need not be controlled (Konstantin's situation, at least, and maybe Peff's too, for public repos), namespaces are a good way to create and manage that "mother repo". Is that not true anymore?

Mind, I have not actually used them in anger anywhere, so I could be missing some really big point here.

sitaram
Re: git repo vs project level authorization
Ken,

On Mon, Dec 05, 2016 at 11:04:44PM +0100, Fredrik Gustafsson wrote:
> On Mon, Dec 05, 2016 at 03:33:51PM -0500, ken edward wrote:
>> I am currently using svn with apache+mod_dav_svn to have a single
>> repository with multiple projects. Each of the projects is controlled
>> by an access control file that lists the project path and the allowed
>> usernames.
>>
>> Does git have this also? where is the doc?
>>
>> Ken
>
> Git does not do hosting or access control. For this you need to use a
> third party program. There are plenty of options for you and each has
> different features and limitations. For example you should take a look
> at gitolite, gitlab, bitbucket, github, gogs. Just to mention a few.
> It's also possible to setup git with ssh or http/https with your own
> access control methods. See the progit book for details here.

For some reason I did not see your email so I am responding to Fredrik's.

If your current system is an access control file, gitolite may be the closest "in spirit" to what you have; the others that Fredrik mentioned are all much more GUI-oriented (and all of them have additional features like issue tracking, code review, etc.). If you need a more github-like experience, try those out.

Gitolite does *only* access control, nothing else, but within that limited scope it's pretty powerful. The simplest/quickest overview is probably this: http://gitolite.com/gitolite/overview.html#basic-use-case

regards
sitaram
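[Editorial aside: to show the "access control file" spirit, here is a rough sketch of what a gitolite conf file looks like — the repo, user, and group names are of course made up:]

```
@staff      =   alice bob
@interns    =   carol

repo projectA
    RW+     =   alice           # alice may push, rewind, and delete refs
    RW      =   @staff          # staff may push (no rewind/delete)
    R       =   @interns        # interns get read-only access

repo projectB
    RW+     =   bob
```

Conceptually this is very close to svn's path-based access file: a list of resources, each followed by who may do what.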
Re: [ANN] Pro Git Reedited 2nd Edition
On 08/12/2016 08:07 PM, Jon Forrest wrote: > > > On 8/12/16 6:11 AM, Sitaram Chamarty wrote: > >> At present gitolite is -- AFAIK -- the only "pure server side", "no GUI" >> solution for access control, and has some interesting features that more >> "GUI" solutions may not. It is also (again, AFAIK) used by kernel.org, >> Fedora, Gentoo, and several other open source projects as well as many >> enterprises. >> >> If you're ok with adding that, I'd be happy to supply some text for your >> consideration. > > I appreciate your offer and I don't disagree with what you're > suggesting, but my goal, for now at least, is to keep the same coverage as > upstream. > > If they add something about Gitolite to their next edition, then > I will also. Oh ok; I must have misunderstood what "2nd edition" means, or did not read your original email carefully enough. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ANN] Pro Git Reedited 2nd Edition
On 07/24/2016 09:37 AM, Jon Forrest wrote:
> This is an announcement of Pro Git Reedited 2nd Edition, which is
> a substantial edit of Chacon and Straub's Pro Git 2nd Edition.
> I spent a lot of time tightening it up and maybe clearing
> up some explanations.
>
> The pdf is downloadable at:
> https://drive.google.com/open?id=0B-Llso12P94-Ujg5Z1dhWUhhMm8
>
> The complete self-contained HTML is at:
> https://drive.google.com/file/d/0B-Llso12P94-U3g1aDBRWjk1Sk0
>
> The sources for this book are at:
> https://github.com/nobozo/progit2
>
> I welcome comments.

Hi,

While I'm kinda happy to see the chapter on gitolite gone (for reasons of difficulty in keeping it current at that time), I humbly suggest that a brief mention of gitolite somewhere in the chapter on "git on the server" would be useful.

At present gitolite is -- AFAIK -- the only "pure server side", "no GUI" solution for access control, and has some interesting features that more "GUI" solutions may not. It is also (again, AFAIK) used by kernel.org, Fedora, Gentoo, and several other open source projects as well as many enterprises.

If you're ok with adding that, I'd be happy to supply some text for your consideration.

regards
sitaram
Re: Allow git alias to override existing Git commands
On 11/11/15 14:58, Jeremy Morton wrote:
> On 11/11/2015 04:48, Sitaram Chamarty wrote:
>> A lot of things in Unix do follow that "give you rope to hang yourself"
>> philosophy. I used to (and to *some* extent still do) think like that,
>> but some years of supporting normal users trying to do stuff has taught
>> me it's not always that simple.
>>
>> I can easily see someone blogging some cool way to do something, and a
>> less savvy user uses that in his gitconfig, and gets burned later
>> (possibly much later, enough that he does not easily make the
>> connection!)
>
> We're not talking about "normal users" here, that's what Google Chrome
> is for. We're talking about Git users using the commandline client.
> They ought to know what they're doing and if they don't, they're
> screwed anyway because there are quite a few gotchas with Git.

I can only repeat what I said before: it's not all black and white.

Reducing the opportunity to make mistakes is useful for everyone, even experts. Especially stuff that you may have set up aeons ago and hits you only aeons later when something (supposedly unrelated) somewhere else changes and you didn't remember and you tear your hair out. It happens to everyone.

The only experts I know who have never torn their hair out over something silly they forgot (could be anything) are the ones who were already bald :)
Re: Allow git alias to override existing Git commands
On 11/11/15 15:42, Jeremy Morton wrote:
> On 11/11/2015 09:51, Sitaram Chamarty wrote:
>> I can only repeat what I said before: it's not all black and white.
>>
>> Reducing the opportunity to make mistakes is useful for everyone, even
>> experts. Especially stuff that you may have set up aeons ago and hits
>> you only aeons later when something (supposedly unrelated) somewhere
>> else changes and you didn't remember and you tear your hair out.
>
> Not when it reduces useful functionality for experts, it's not.

Speaking of... did you try the script I sent in an earlier mail? Putting it in /usr/local/bin (on Fedora. YMMV) seems to work fine, since that appears earlier than /bin where the real git lives.
Re: Allow git alias to override existing Git commands
On 11/11/15 01:34, Jeremy Morton wrote:
> On 10/11/2015 18:12, Stefan Beller wrote:
>> On Tue, Nov 10, 2015 at 8:31 AM, Jeremy Morton wrote:
>>> It's recently come to my attention that the "git alias" config functionality
>>> ignores all aliases that would override existing Git commands. This seems
>>> like a bad idea to me.
>>
>> This ensures that the plumbing commands always work as expected.
>> As scripts *should* only use plumbing commands, the scripts should
>> work with high probability despite all the crazy user configuration/aliases.
>
> I just disagree with this. If a user chooses to override their Git
> commands, it's their problem. Why should Git care about this? It
> should provide the user with the option to do this, and if the user
> ruins scripts because of their aliases, it is not Git's problem. What
> you are doing is taking away power from users to use git aliases to
> their full potential.

A lot of things in Unix do follow that "give you rope to hang yourself" philosophy. I used to (and to *some* extent still do) think like that, but some years of supporting normal users trying to do stuff has taught me it's not always that simple.

I can easily see someone blogging some cool way to do something, and a less savvy user uses that in his gitconfig, and gets burned later (possibly much later, enough that he does not easily make the connection!)

So for the record, I am definitely against this kind of change. But if I were in your place, and really *needed* this, here's what I would do:

    #!/bin/bash
    # this file is named 'git' and placed in a directory that is earlier in $PATH
    # than the real 'git' binary (typically $HOME/bin). This allows you to
    # override git sub-commands by adding stuff like this to your ~/.gitconfig
    # (notice the "o-" prefix):
    #
    # [alias]
    #     o-clone = clone --recursive

    GIT=/bin/git    # the real 'git' binary

    cmd="$1"
    shift

    if $GIT config --get alias.o-$cmd >/dev/null
    then
        $GIT o-$cmd "$@"
    else
        $GIT $cmd "$@"
    fi
Re: Where to report security vulnerabilities in git?
On 08/22/2015 04:25 AM, Guido Vranken wrote: List, I would like to report security vulnerabilities in git. Due to the sensitive nature of security-impacting bugs I would like to know if there's a dedicated e-mail address for this, so that the issues at play can be patched prior to a coordinated public disclosure of the germane exploitation details. I did find an older thread in the archive addressing this question ( http://thread.gmane.org/gmane.comp.version-control.git/260328/ ), but because I'm unsure if those e-mail addresses are still relevant, I'm asking again.

If it has anything to do with remote access (via ssh or http) please copy me also. I wrote/write/maintain gitolite, which is a reasonably successful access control system for git servers.

regards
sitaram
Re: git name-rev not accepting abbreviated SHA with --stdin
On 07/03/2015 11:06 PM, Junio C Hamano wrote: Sitaram Chamarty sitar...@gmail.com writes: On 06/25/2015 05:41 AM, Junio C Hamano wrote: Sitaram Chamarty sitar...@gmail.com writes: This *is* documented, but I'm curious why this distinction is made. I think it is from mere laziness, and also in a smaller degree coming from an expectation that --stdin would be fed by another script like rev-list where feeding full 40-hex is less work than feeding unique abbreviated prefix. Makes sense; thanks. Maybe if I feel really adventurous I will, one day, look at the code :-) Sorry, but I suspect this is not 100% laziness; it is meant to read text that has object names sprinkled in and output text with object names substituted. I suspect that this was done to prevent a short string that may look like an object name like deadbabe from getting converted into an unrelated commit object name.

As a perl programmer, laziness is much more palatable to me as a reason ;-) Jokes apart, I'm not sure the chances of *both* those things happening -- an accidental hash-like string in the text *and* it matching an existing hash -- are high enough to bother. If it can be done without too much code, it probably should.
Re: git name-rev not accepting abbreviated SHA with --stdin
On 06/25/2015 05:41 AM, Junio C Hamano wrote: Sitaram Chamarty sitar...@gmail.com writes: This *is* documented, but I'm curious why this distinction is made. I think it is from mere laziness, and also in a smaller degree coming from an expectation that --stdin would be fed by another script like rev-list where feeding full 40-hex is less work than feeding unique abbreviated prefix.

Makes sense; thanks. Maybe if I feel really adventurous I will, one day, look at the code :-)
git name-rev not accepting abbreviated SHA with --stdin
Hi all,

git name-rev does not accept abbreviated SHAs if --stdin is used, though it works when the SHA is given directly on the command line:

    $ git version
    git version 2.4.3

    $ git name-rev --tags d73f544
    d73f544 tags/v3.6.3~29

    $ git name-rev --tags --stdin
    d73f544
    d73f544

This *is* documented, but I'm curious why this distinction is made. Is it merely a matter of parsing or were there some other complications I am unaware of, which forced this distinction to be made?

thanks
sitaram
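[Editorial aside, a sketch not from the thread: one workaround is to expand abbreviations with rev-parse before feeding them to `name-rev --stdin`. The demo below builds a throwaway repo so the SHAs exist; in an existing repo only the last pipeline is needed.]

```shell
# Build a scratch repo with one tagged commit (all names are throwaway).
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "demo commit"
git tag v1.0
short=$(git rev-parse --short HEAD)

# A short SHA on stdin: name-rev passes it through unannotated.
echo "$short" | git name-rev --tags --stdin

# Expand to the full SHA first, and name-rev annotates it with the tag.
git rev-parse "$short" | git name-rev --tags --stdin
```

(Newer git spells the second pipeline's flag `--annotate-stdin`, but `--stdin` still works.)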
Re: On undoing a forced push
On 06/09/2015 05:42 PM, Duy Nguyen wrote: From a thread on Hacker News. It seems that if a user does not have access to the remote's reflog and accidentally forces a push to a ref, how does he recover it? In order to force push again to revert it back, he would need to know the remote's old SHA-1. Local reflog does not help because remote refs are not updated during a push. This patch prints the latest SHA-1 before the forced push in full. He then can do git push remote +old-sha1:ref He does not even need to have the objects that old-sha1 refers to. We could simply push an empty pack and the remote will happily accept the force, assuming garbage collection has not happened. But that's another and a little more complex patch.

If I am not mistaken, we actively prevent people from downloading an unreferenced SHA (such as would happen if you overwrote refs that contained sensitive information like passwords). Wouldn't allowing the kind of push you just described, require negating that protection?
Re: On undoing a forced push
On 06/09/2015 07:55 PM, Jeff King wrote: On Tue, Jun 09, 2015 at 07:36:20PM +0530, Sitaram Chamarty wrote: This patch prints the latest SHA-1 before the forced push in full. He then can do git push remote +old-sha1:ref He does not even need to have the objects that old-sha1 refers to. We could simply push an empty pack and the remote will happily accept the force, assuming garbage collection has not happened. But that's another and a little more complex patch. If I am not mistaken, we actively prevent people from downloading an unreferenced SHA (such as would happen if you overwrote refs that contained sensitive information like passwords). Wouldn't allowing the kind of push you just described, require negating that protection? No, this has always worked. If you have write access to a repository, you can fetch anything from it with this trick. Even if we blocked this, there are other ways to leak information. For instance, I can push up objects that are similar to the target object, claim to have the target object, and then hope git will make a delta between my similar object and the target. Iterate on the similar object and you can eventually figure out what is in the target object.

aah ok; I must have mis-remembered something. Thanks!
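[Editorial aside: a small end-to-end illustration of the recovery being discussed. All repos and names here are throwaway; in real life the old SHA-1 comes from the output of the forced push (which is exactly what Duy's patch prints in full).]

```shell
work=$(mktemp -d) && cd "$work"
git init -q --bare remote.git
git clone -q remote.git clone 2>/dev/null
cd clone
branch=$(git symbolic-ref --short HEAD)

# A good commit, pushed normally.
git -c user.name=d -c user.email=d@example.com \
    commit -q --allow-empty -m "good"
good=$(git rev-parse HEAD)
git push -q origin "$branch"

# Oops: history rewritten and force-pushed.
git -c user.name=d -c user.email=d@example.com \
    commit -q --amend --allow-empty -m "oops"
git push -q -f origin "$branch"

# Recovery: force the remote ref back to the old SHA-1.
git push -q origin "+$good:refs/heads/$branch"
```

Here the client still has the old object (via its reflog); the point of the thread is that even a client *without* it could in principle force the ref back, as long as the server has not garbage-collected.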
Re: GIT for Microsoft Access projects
On 06/08/2015 09:44 PM, Konstantin Khomoutov wrote:
> On Mon, 8 Jun 2015 9:45:17 -0500 hack...@suddenlink.net wrote:
> [...]
>> My question is, will GIT work with MS access forms, queries, tables,
>> modules, etc?
> [...]
> Git works with files. So in principle it will work with *files*
> containing your MS access stuff. But Git will consider and treat those
> files as opaque blobs of data. That is, you will get no fancy diffing
> like asking Git to graphically

More importantly, you won't get any *merging*, which means you need to be careful about two developers making changes to the same file. This is the only situation where locking (a feature that is inherently at odds with the idea of a *distributed* VCS) is useful.

> (or otherwise) show you what exact changes have been made to a
> particular form or query between versions X and Y of a given MS access
> document -- all it will be able to show you is commit messages
> describing those changes. So... If you're fine with this setting, Git
> will work for you, but if not, it won't.
>
> One last note: are you really sure you want an SCM/VCS tool to manage
> your files and not a document management system (DMS) instead? I mean
> stuff like Alfresco (free software by the way) and the like.
Re: Git Server Repository Security?
On 05/18/2015 04:28 PM, John McIntyre wrote: 2015-05-18 11:26 GMT+01:00 Heiko Voigt hvo...@hvoigt.net: If you want a simple tool using ssh-keys have a look at gitolite[1]. It quite simple to setup and with it you can specify all kinds of access rights. That's adding a separate level of complexity. I looked into filesystem-level permissions. I don't see any means of doing so, because everyone accesses the repositories using the 'git' user. So even if I add a group like 'devClient1' and then change the group ownership of a repo to that user, they'll still be able to access all repos..?

My usual answer to this is http://gitolite.com/gitolite/overview.html#basic-use-case

The first example is doable with file system permissions if you give everyone a separate userid, but it's a nightmare. The second one is not even possible.
Re: resume downloads
On 05/11/2015 03:49 AM, Junio C Hamano wrote:

> The current thinking is to model this after the repo tool. Prepare a
> reasonably up-to-date bundle file on the server side,

<shameless plug (but not commercial)>

For people using gitolite, the server side issues of generating a reasonably up-to-date bundle *and* enabling it for resumable download using rsync (with the same ssh key used to gain gitolite access), can all be handled by gitolite.

</shameless plug>

Of course the client side issues still remain; gitolite can't help there.

> add a protocol capability to advertise the URL to download that bundle
> from upload-pack, and have git clone to pay attention to it. Then, a
> git clone could become:
>
> - If the capability advertises such a prebuilt bundle, spawn curl or
>   wget internally to fetch it. This can be resumed when the connection
>   goes down and will grab majority of the data necessary.
>
> - Extract the bundle into temporary area inside .git/refs/ to help the
>   next step.
>
> - Internally do a git fetch to the original server. Thanks to the
>   bundle transfer that has already happened, this step will become a
>   small incremental update.
>
> - Then prune away the temporary .git/refs/ refs that were in the
>   bundle, as these are not the up-to-date refs that exist on the
>   server side.
>
> A few points that need to be considered by whoever is doing this are:
>
> - Where to download the bundle, so that after killing git clone that
>   is still in the bundle-download phase, the next invocation of git
>   clone can notice and resume the bundle-download;
>
> - What kind of transfer protocols do we want to support? Is http and
>   https from CDN sufficient? In other words, what exactly should the
>   new capability say to point at the prebuilt bundle?
>
> These (and probably there are several others) are not something that
> repo has to worry about, but would become issues when we try to fold
> this into git clone.
>
> On Sun, May 10, 2015 at 2:55 PM, Thiago Farina tfrans...@gmail.com wrote:
>> Hi,
>>
>> Are there links to discussion on this? I mean, is resume downloads a
>> feature that is still being considered? Being able to download huge
>> repos like WebKit, Linux, LibreOffice in small parts seems like a good
>> feature to me.
>>
>> --
>> Thiago Farina
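[Editorial aside: the manual version of the flow Junio describes already works today and can be sketched as below. Local throwaway paths stand in for the server; a real client would fetch the bundle over a resumable channel such as `wget -c` or rsync.]

```shell
work=$(mktemp -d) && cd "$work"

# "Server": a repo with an old commit, bundled, then a newer commit.
git init -q server
git -C server -c user.name=d -c user.email=d@example.com \
    commit -q --allow-empty -m "old commit"
git -C server bundle create project.bundle HEAD --all
git -C server -c user.name=d -c user.email=d@example.com \
    commit -q --allow-empty -m "new commit"

# "Client": clone from the bundle (the resumably-downloadable part),
# then repoint origin at the real repo and fetch the small increment.
git clone -q server/project.bundle client
git -C client remote set-url origin "$work/server"
git -C client fetch -q origin
```

After the final fetch the client has everything; only the delta between the bundle and the server's current state crossed the (non-resumable) git protocol.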
Re: How to send a warning message from git hosting server?
On 04/12/2015 04:55 PM, Yi, EungJun wrote: On Wed, Apr 8, 2015 at 8:08 PM, Tony Finch d...@dotat.at wrote: Yi, EungJun semtlen...@gmail.com wrote: I want a way to response a remote message when a client send any kind of request. Is it possible? Yes, though you need a wrapper around git. Recent versions of gitolite have a motd message of the day feature. It sounds nice. Is the wrapper for git client or git server?

Gitolite is -- in this context -- a wrapper on the git server. Its main purpose is access control; the motd feature is just an extra that happened to be easy once there was a wrapper anyway.
Re: An interesting opinion on DVCS/git
On 03/04/2015 08:55 PM, Michael J Gruber wrote: Yes, that article has a few really weak lines of arguments, such as the tutorial count. Here's his definition of the main draw of a DVCS: No, the only thing that a DVCS gets you, by definition, is that everyone gets a copy of the full offline history of the entire repository to do with as you please. That completely misses the point. What about committing while offline, 'git blame' months-old changes offline, or local branches that don't have to make it to the server until they have cooked for a while, and so on and on? We're not all facebooks with multi-GB repos, and I certainly don't care as much about disk space or bandwidth if losing those features is the cost. It gets worse: Let me tell you something. Of all the time I have ever used DVCSes, over the last twenty years if we count Smalltalk changesets and twelve or so if you don't, I have wanted to have the full history while offline a grand total of maybe about six times.

I don't know how you can work on anything reasonably complex and multi-developer without using some of those features six times in a *week* (sometimes, six times in a *weekend*) let alone 12 years.
Re: saving git push --signed certificate blobs
On 12/30/2014 11:18 PM, Junio C Hamano wrote:
> Sitaram Chamarty sitar...@gmail.com writes:
>
>> Just wanted to say there's a little script at [1] that saves the
>> certificate blobs generated on the server side by git push --signed.
>> Quoting from the source:
>>
>>     # Collects the cert blob on push and saves it, then, if a certain number of
>>     # signed pushes have been seen, processes all the saved blobs in one go,
>>     # adding them to the special ref 'refs/push-certs'.  This is done in a way
>>     # that allows searching for all the certs pertaining to one specific branch
>>     # (thanks to Junio Hamano for this idea plus general brainstorming).
>>
>> Note that although I posted it in the gitolite ML, this has very little
>> to do with gitolite.  Any git server can use it, with only one very
>> minor change [2] needed.
>>
>> sitaram
>>
>> [1]: https://groups.google.com/forum/#!topic/gitolite/7cSrU6JorEY
>> [2]: Either set the GL_OPTIONS_GPC_PENDING environment variable by
>>      reading its value from 'git config', or replace the only line that
>>      uses that variable with some other test.
>
> Nicely done.
>
> We'd need to give you a tool to make it easy to create a validated
> chain of certificates out of
>
>     $ git log refs/push-certs -- refs/heads/master
>
> to make the history this script creates truly useful, but I think it
> is a very good start.
>
> I can see that you tried to make the log output human readable by
> reformatting $cf; I am not sure if it gives us much value.  I would
> have expected that you would just use the blob contents for the log
> message as-is, so that
>
>     $ git log --pretty=raw refs/push-certs -- refs/heads/master |
>       validate-cert-chain
>
> can just work on blobs (shown in the log output) without having to
> extract the blobs by doing something like
>
>     $ git rev-list refs/push-certs -- refs/heads/master |
>       while read commit
>       do
>           git cat-file blob $commit:refs/heads/master | validate-cert
>       done

I see what you mean.  And it looks like using --format=%B also works
pretty well.  Will fix.

> By the way, you seem to like "cat" too much, though.  You don't have
> to cat a single file into a pipeline.

Gee, I hope Randal Schwartz is not on this list :)  Anyway, the
previous fix also removes most of them.

I'm attaching the current version so non-gitolite users can find it
without having to go to the gitolite repo.  For gitolite users, it's
somewhere in contrib/ in the source tree.

sitaram

Thanks.

#!/bin/sh

# --

# post-receive hook to adopt push certs into 'refs/push-certs'

# Collects the cert blob on push and saves it, then, if a certain number of
# signed pushes have been seen, processes all the saved blobs in one go,
# adding them to the special ref 'refs/push-certs'.  This is done in a way
# that allows searching for all the certs pertaining to one specific branch
# (thanks to Junio Hamano for this idea plus general brainstorming).

# The collection happens only if $GIT_PUSH_CERT_NONCE_STATUS = OK; again,
# thanks to Junio for pointing this out; see [1]
#
# [1]: https://groups.google.com/forum/#!topic/gitolite/7cSrU6JorEY

# WARNINGS:
#   Does not check that GIT_PUSH_CERT_STATUS = G.  If you want to check that
#   and FAIL the push, you'll have to write a simple pre-receive hook
#   (post-receive is not the place for that; see 'man githooks').
#
#   Gitolite users: failing the hook cannot be done as a VREF because git does
#   not set those environment variables in the update hook.  You'll have to
#   write a trivial pre-receive hook and add that in.

# Relevant gitolite doc links:
#   repo-specific environment variables
#       http://gitolite.com/gitolite/dev-notes.html#rsev
#   repo-specific hooks
#       http://gitolite.com/gitolite/non-core.html#rsh
#       http://gitolite.com/gitolite/cookbook.html#v3.6-variation-repo-specific-hooks

# Environment:
#   GIT_PUSH_CERT_NONCE_STATUS should be OK (as mentioned above)
#
#   GL_OPTIONS_GPC_PENDING (optional; defaults to 1).  This is the number of
#   git push certs that should be waiting in order to trigger the post
#   processing.  You can set it within gitolite like so:
#
#       repo foo bar    # or maybe just 'repo @all'
#           option ENV.GPC_PENDING = 5

# Setup:
#   Set up this code as a post-receive hook for whatever repos you need to.
#   Then arrange to have the environment variable GL_OPTIONS_GPC_PENDING set
#   to some number, as shown above.  (This is only required if you need it to
#   be greater than 1.)  It could of course be different for different repos.
#   Also see the Invocation section below.

# Invocation:
#   Normally via git (see 'man githooks'), once it is set up as a
#   post-receive hook.
#
#   However, if you set the pending limit high, and want to periodically
#   clean up pending certs without necessarily waiting for the counter to
#   trip, do the following (untested):
#
#       RB=$(gitolite query-rc GL_REPO_BASE
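The `--format=%B` idea agreed on above ("using --format=%B also works pretty well") can be sketched as follows. This is a hypothetical illustration, not the script's actual code: the throwaway repo stands in for a server that has been collecting certs, and a plain echo stands in for a real verifier such as gpg or a validate-cert script.

```shell
#!/bin/sh
# Hypothetical sketch: each commit under refs/push-certs carries one
# saved cert blob as its commit message, so --format=%B recovers the
# blobs verbatim.  The demo repo below exists only to make this runnable.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/certs-demo"
git -C "$tmp/certs-demo" -c user.name=t -c user.email=t@example.com \
    commit -q --allow-empty -m 'certificate blob 0.1'
git -C "$tmp/certs-demo" update-ref refs/push-certs HEAD

# recover the stored cert blob(s); a real setup would pipe this into a
# verifier instead of echoing it
certs=$(git -C "$tmp/certs-demo" log --format=%B refs/push-certs)
echo "would validate: $certs"
```
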
saving git push --signed certificate blobs
Hello,

Just wanted to say there's a little script at [1] that saves the
certificate blobs generated on the server side by git push --signed.
Quoting from the source:

    # Collects the cert blob on push and saves it, then, if a certain number of
    # signed pushes have been seen, processes all the saved blobs in one go,
    # adding them to the special ref 'refs/push-certs'.  This is done in a way
    # that allows searching for all the certs pertaining to one specific branch
    # (thanks to Junio Hamano for this idea plus general brainstorming).

Note that although I posted it in the gitolite ML, this has very little
to do with gitolite.  Any git server can use it, with only one very
minor change [2] needed.

sitaram

[1]: https://groups.google.com/forum/#!topic/gitolite/7cSrU6JorEY
[2]: Either set the GL_OPTIONS_GPC_PENDING environment variable by
     reading its value from 'git config', or replace the only line that
     uses that variable with some other test.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
GIT_PUSH_CERT* env vars and update/post-update hooks...
Hi git core devs,

Any chance I could persuade you to set the GIT_PUSH_CERT* environment
variables for the update (and post-update) hooks also?

Background: gitolite takes over the update hook [1] for authorisation
and enforcement, and I want to avoid taking over the pre-receive hook
also in order to do this check.

The post-update is not so important; gitolite doesn't use it anyway, so
if I have to take over one of them, I may as well take over
post-receive.  I just added that for consistency.

thanks
sitaram

[1]: because it's nice to *selectively* reject refs when more than one
     ref is pushed at the same time; pre-receive is all or none.
Re: Where is the best place to report a security vulnerability in git?
On 11/27/2014 06:50 AM, Jonathan Nieder wrote:
> Hi Hugh,
>
> Hugh Davenport wrote:
>> Where is the best place to report a security vulnerability in git?
>
> Current practice is to contact Junio C Hamano gits...@pobox.com.
> Cc-ing Jeff King p...@peff.net isn't a bad idea while at it.
>
> We should probably set up a mailing list to make this more obvious,
> but that's what we have today.

Hi Hugh,

I maintain a somewhat widely used access control program for remote
access to git, so I'm interested also.  Gitolite [1] and similar
systems provide access control for git repos.  There's a very good
chance that something which is not a concern for local use could become
an attack vector if enabled through gitolite.  Hence my interest, and
my request that I be copied.

Jonathan/Junio/Jeff: if such a mailing list does happen, please
consider adding me to it.

regards
sitaram

[1]: https://gitolite.com
[slightly OT?] TOTP gateway for any service on any server
Hi all,

I've just created a general purpose TOTP gatekeeper that is designed to
gate access to any service on any server/OS (as long as traffic can
only go *through* the TOTP gatekeeper).

The inspiration was Konstantin Ryabitsev's implementation of two-factor
authentication for kernel.org -- from which I got the idea of using
TOTP to whitelist an IP for some time.  I then extended it to protect
any TCP port on any server behind the gatekeeper.

http://gitolite.com/totport/ is the documentation, and the source is
linked there.

I'd welcome any feedback, but please be mindful of the fact that deep
discussion may veer way off-topic for the git or gitolite mailing
lists, although I hope I won't get flak for *this* email :-)

sitaram
Re: git feature-branch
On 07/25/2014 11:10 AM, Sheldon Els wrote:
> It is just a shell script yes, and a man page to make things a bit
> more discoverable for "git help feature-branch".  The brew command
> comes from homebrew, a popular OSX package manager.  My platform of
> choice.

You might want to at least add "these instructions are for people using
macs".  Otherwise it seems like you assume everyone is using macs, and
nothing else exists in the world as far as you are concerned.

> Perhaps I can get support for an easy install for your platform.  Do
> you think a Makefile that installs to /usr/local/bin and
> /usr/local/share/man would fit, or are you on windows?

Ouch.  That hurt.

When I said "more generic" I meant it's just *one* shell script; put it
somewhere on your $PATH.  That should be sufficient for something like
this (at the risk of going a bit off-topic for the list).

On 25 July 2014 05:11, Sitaram Chamarty sitar...@gmail.com wrote:
> On 07/25/2014 03:45 AM, Sheldon Els wrote:
>> Hi
>>
>> A small tool I wrote that is useful for some workflows.  I thought
>> it'd be worth sharing.  https://github.com/sheldon/git-feature-branch/
>
> As far as I can tell it's just a shell script; does it really need
> installation instructions, and if so can they not be more generic than
> "brew install"?  Speaking for myself I have NO clue what that is.
Re: git feature-branch
On 07/25/2014 03:45 AM, Sheldon Els wrote:
> Hi
>
> A small tool I wrote that is useful for some workflows.  I thought
> it'd be worth sharing.  https://github.com/sheldon/git-feature-branch/

As far as I can tell it's just a shell script; does it really need
installation instructions, and if so can they not be more generic than
"brew install"?  Speaking for myself I have NO clue what that is.
Re: optimising a push by fetching objects from nearby repos
On 05/11/2014 11:34 PM, Junio C Hamano wrote:
> Sitaram Chamarty sitar...@gmail.com writes:
>
>> But what I was looking for was validation from git.git folks of the
>> idea of replicating what "git clone -l" does, for an *existing* repo.
>>
>> For example, I'm assuming that bringing in only the objects -- without
>> any of the refs pointing to them, making them all dangling objects --
>> will still allow the optimisation to occur (i.e., git will still say
>> "oh yeah, I have these objects, even if they're dangling, so I won't
>> ask for them from the pusher", and not "oh, these are dangling
>> objects, so I don't recognise them from this perspective -- you'll
>> have to send me those again").
>
> So here is an educated guess by a git.git folk.  I haven't read the
> codepath for some time, so I may be missing some details:
>
>  - The set of objects sent over the wire in the push direction is
>    determined by the receiving end listing what it has to the sending
>    end, and then the sending end excluding what the receiving end told
>    it that it already has.
>
>  - The receiving end tells the sending end what it has by showing the
>    names of its refs and their values.  Having otherwise dangling
>    objects in your object store alone will not make them reachable
>    from the refs shown to the sending end.  But there is another trick
>    the receiving end employs.
>
>  - The receiving end also includes the refs and their values that
>    appear in the repositories it borrows objects from (its alternate
>    repositories), when it tells the sending end what objects it
>    already has.
>
> So what you assumed is not entirely correct---bringing in only the
> objects will not give you any optimization.  But because we infer,
> from the location of the object store (i.e. the "objects" directory),
> where the refs that point at these borrowed objects exist (i.e. in
> ../refs relative to that objects directory), in order to make sure
> that we do not have to say "oh, these are dangling but we know their
> history is not broken", we still get the same optimisation.

Thanks!  Everything makes sense.

However, I'm not using the alternates mechanism.  Since gitolite has
the advantage of allowing me to do something before and something after
the git-receive-pack, I'm fetching all the refs into a temporary
namespace before, and deleting all of them after.  So, just for the
duration of the push, the refs do exist, and the optimisation (of
network traffic) therefore happens.

In addition, since I check that the user has read access to the lender
repo (and don't do this optimisation if he does not), there is -- by
definition -- no security issue, in the sense that he cannot get
anything from the lender repo that he could not have got directly.

Thanks for all your help again, especially the very clear explanation!

regards
sitaram
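The fetch-before/delete-after dance described here can be sketched roughly as follows. The repo layout and the refs/borrowed/* namespace are illustrative assumptions for this sketch, not what gitolite actually ships:

```shell
#!/bin/sh
# Sketch of the trick: before receive-pack runs, fetch the lender repo's
# refs into a throwaway refs/borrowed/* namespace so its objects become
# local (and get advertised to the pusher); afterwards, delete the refs.
set -e
base=$(mktemp -d)

# a "lender" repo with some history, and an empty bare "fork"
git init -q "$base/lender"
git -C "$base/lender" -c user.name=t -c user.email=t@example.com \
    commit -q --allow-empty -m 'shared history'
sha=$(git -C "$base/lender" rev-parse HEAD)
git init -q --bare "$base/fork"

# before the push: borrow every lender ref (and its objects)
git -C "$base/fork" fetch -q "$base/lender" '+refs/*:refs/borrowed/*'

# ... the push would happen here; receive-pack advertises the
# refs/borrowed/* values, so the client does not re-send those objects ...

# after the push: drop the temporary refs; the objects stay until gc
git -C "$base/fork" for-each-ref --format='delete %(refname)' refs/borrowed |
    git -C "$base/fork" update-ref --stdin
```

After the cleanup the borrowed objects are dangling but still present, which is exactly the state the thread is discussing.
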
optimising a push by fetching objects from nearby repos
Hi,

Is there a trick to optimising a push by telling the receiver to pick
up missing objects from some other repo on its own server, to cut down
even more on network traffic?  So, hypothetically:

    git push user@host:repo1 --look-for-objects-in=repo2

I'm aware of the alternates mechanism, but that makes the dependency on
the other repo sort-of permanent.  I'm looking for a temporary
dependence, just for the duration of the push.  Naturally, the objects
would have to be brought into the target repo for that to happen; it's
just that this would be doing more from disk and less from the network.

My gut says this isn't possible, and I've searched enough to almost be
sure, but before I give up, I wanted to ask.

thanks
sitaram

Milki: I'm sure you won't mind the cc, since you know the context :-)
Re: optimising a push by fetching objects from nearby repos
On 05/11/2014 02:32 AM, Junio C Hamano wrote:
> Sitaram Chamarty sitar...@gmail.com writes:
>
>> Is there a trick to optimising a push by telling the receiver to pick
>> up missing objects from some other repo on its own server, to cut
>> down even more on network traffic?  So, hypothetically:
>>
>>     git push user@host:repo1 --look-for-objects-in=repo2
>>
>> I'm aware of the alternates mechanism, but that makes the dependency
>> on the other repo sort-of permanent.
>
> In the direction of fetching, this may give a good starting point:
>
>     http://thread.gmane.org/gmane.comp.version-control.git/243918/focus=245397

That's an interesting thread, and it's recent too.  However, it's about
clone (though the intro email mentions other commands also).  I'm
specifically interested in push efficiency right now.

When you fork someone's repo to your own space, and you push your fork
to the same server, it ought to be able to get most of the common
objects from disk (specifically, from the repo you forked), and only
what extra you did from the network.  Clones do have a workaround
(clone with --reference, then repack, as you said in that thread), but
no such workaround exists for push.

> In the direction of pushing, theoretically you could:
>
>  - define a new capability "look-for-objects-in" to pass the name of
>    the repository from "git push" to the receive-pack;
>
>  - have receive-pack temporarily borrow from the named repository (if
>    the policy on the server side allows it), and accept the push;
>
>  - repack in order to dissociate the receiving repository from the
>    other repository it temporarily borrowed from.
>
> which would be the natural inverse of the approach suggested in the
> "Can I borrow just temporarily while cloning?" thread.  But I haven't
> thought things through with respect to what else needs to be modified
> to make sure this does not have adverse interaction with simultaneous
> pushes into the same repository, which would make it harder to solve
> for receive-pack than for clone/fetch.

I'll leave it in your capable hands :-)  My C coding days are long gone!

I do have a way to do this in gitolite (haven't coded it yet; just
thinking).  Gitolite lets you specify something to do before
git-*-pack runs, and I was planning something like this.

Terminology: borrow, borrower repo, reference repo.

Borrow, relaxed mode:

1. Check if the user has read access to the reference repo; skip the
   rest of this if he doesn't.

2. From the reference repo's objects, find all directories and mkdir
   them into the borrower's objects directory, then find all files and
   ln (hardlink) them.  This is presumably what "clone -l" does.

   This method is close to constant time, since we're not copying
   objects.  It has the potential issue that if an object existed in
   the reference repo but was subsequently *deleted* (say, a commit
   that contained a password, which was quickly overwritten when
   discovered), and the attacker knows the SHA, he can get the commit
   out by pushing a commit that depends on it, then fetching it back.
   (He could do that to the reference repo directly if he had write
   access, but we'll assume he doesn't, so this *is* a possible
   attack.)

Borrow, strict mode:

1. (Same as for relaxed mode.)

2. Actually *fetch* all refs from the reference repo to the borrower
   (into, say, 'refs/borrowed'), then delete all those refs, so you
   just have the objects now.

   Unlike the previous method, this takes time proportional to the
   delta between borrower and reference, and may load the system a bit,
   but unless the reference repo is highly volatile, this will settle
   down.  The point is that it cannot be used to get anything that the
   user doesn't already have access to anyway.

I still have to try it, but it sounds like both of these would work.
I'd appreciate any comments though...

regards
sitaram
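The "relaxed mode" step 2 above can be sketched like this. It is purely illustrative (repo paths are made up, and real code would also have to cope with packfiles being repacked underneath it, ln failing across filesystems, and concurrent pushes):

```shell
#!/bin/sh
# Rough sketch of "relaxed" borrowing: recreate the reference repo's
# object-directory layout in the borrower and hard-link the object
# files -- roughly what 'git clone -l' does.
set -e
base=$(mktemp -d)
git init -q "$base/reference"
git -C "$base/reference" -c user.name=t -c user.email=t@example.com \
    commit -q --allow-empty -m seed
sha=$(git -C "$base/reference" rev-parse HEAD)
git init -q --bare "$base/borrower"

src="$base/reference/.git/objects"
dst="$base/borrower/objects"

# step 2: mkdir the directories, then hardlink the files
( cd "$src" && find . -type d | while read -r d; do mkdir -p "$dst/$d"; done )
( cd "$src" && find . -type f | while read -r f; do ln "$f" "$dst/$f"; done )
```

After this runs, the borrower can resolve the reference repo's objects even though no ref points at them, which is the property the optimisation relies on.
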
Re: optimising a push by fetching objects from nearby repos
On 05/11/2014 07:04 AM, Storm-Olsen, Marius wrote:
> On 5/10/2014 8:04 PM, Sitaram Chamarty wrote:
>> On 05/11/2014 02:32 AM, Junio C Hamano wrote:
>>
>> That's an interesting thread and it's recent too.  However, it's
>> about clone (though the intro email mentions other commands also).
>> I'm specifically interested in push efficiency right now.  When you
>> fork someone's repo to your own space, and you push your fork to the
>> same server, it ought to be able to get most of the common objects
>> from disk (specifically, from the repo you forked), and only what
>> extra you did from the network.
>> ...
>> I do have a way to do this in gitolite (haven't coded it yet; just
>> thinking).  Gitolite lets you specify something to do before
>> git-*-pack runs, and I was planning something like this:
>
> And here you're poking the stick at the real solution to your problem.
> Many of the Git repo managers will neatly set up a server-side repo
> clone for you, with alternates into the original repo, saving both
> network and disk I/O.

Gitolite already has a fork command that does that (though it uses -l,
not alternates).  I specifically don't want to use alternates, and I
also specifically am looking for something that activates on a push --
in the situations I am looking to optimise, the clone already happened.

> So your work flow would instead be:
>
> 1. Fork repo on server
> 2. Remotely clone your own forked repo
>
> I think it's more appropriate to handle this higher level operation
> within the security context of a git repo manager, rather than
> directly in git.

Yes, because of the read access check in my suggested procedure to
handle this.  (Otherwise this is as valid as the plan suggested for
clone in Junio's email in [1].)

[1]: http://thread.gmane.org/gmane.comp.version-control.git/243918/focus=245397

I will certainly be doing this in gitolite.  The point of my post was
to validate the flow with the *git* experts in case they catch
something I missed, not to say this should be done *in* git.
Re: optimising a push by fetching objects from nearby repos
On 05/11/2014 08:41 AM, Storm-Olsen, Marius wrote:
> On 5/10/2014 9:10 PM, Sitaram Chamarty wrote:
>
> 1. Clone remote repo
> 2. Hack hack hack
> 3. Fork repo on server
> 4. Push changes to your own remote repo
>
> is equally efficient.

Your suggestions are good for a manual setup where the target repo
doesn't already exist.  But what I was looking for was validation from
git.git folks of the idea of replicating what "git clone -l" does, for
an *existing* repo.

For example, I'm assuming that bringing in only the objects -- without
any of the refs pointing to them, making them all dangling objects --
will still allow the optimisation to occur (i.e., git will still say
"oh yeah, I have these objects, even if they're dangling, so I won't
ask for them from the pusher", and not "oh, these are dangling objects,
so I don't recognise them from this perspective -- you'll have to send
me those again").

[1]: for any gitolite-aware folks reading this: this involves
     mirroring, bringing a new mirror into play, normal repos, wild
     repos, and on and on...
Re: material for git training sessions/presentations
On 05/05/2014 09:48 AM, Chris Packham wrote:
> Hi,
>
> I know there are a few people on this list that do git training in
> various forms.  At $dayjob I've been asked to run a few training
> sessions in house.  The initial audience is SW developers, so they are
> fairly clued up on VCS concepts, and most have some experience
> (although some not positive) with git.  Eventually this may also
> include some QA folks who are writing/maintaining test suites, who
> might be less clued up on VCSes in general.
>
> I know if I googled for git tutorials I'll find a bunch, and I can
> probably write a few myself, but does anyone have any advice from
> training sessions they've run about how best to present the subject
> matter?  Particularly to a fairly savvy audience who may have
> developed some bad habits.  My plan was to try and have a few
> PCs/laptops handy and try to make it a little interactive.
>
> Also, if anyone has any presentations I could use under a CC-BY-SA (or
> other liberal license) as a basis for any material I produce, that
> would save me starting from scratch.

I've written and used the following; the first one is a bit more
popular (or at least has been mentioned several times on #git):

1. git concepts simplified: http://gitolite.com/gcs.html
2. a presentation on git: http://gitolite.com/git.html

You can use them straight off the web.
Re: Recording the current branch on each commit?
On 04/28/2014 02:22 PM, Jeremy Morton wrote:
> On 28/04/2014 03:30, Sitaram Chamarty wrote:
>> On 04/28/2014 01:03 AM, Johan Herland wrote:
>>> Yeah, sure.  Author and Date (and Committer, for that matter) is
>>> just metadata, and the current branch name is simply just another
>>> kind of metadata.  All of them are more-or-less free-form text
>>> fields, and off
>>
>> no they're not.  In strictly controlled environments they form part
>> of the audit record for the source code.  Yes they can be faked
>> (explicitly), but -- again in strictly controlled environments --
>> that can be limited to before it was first pushed.
>
> Why these specific headers as part of the audit record, though?
> Aren't you just arbitrarily defining them as part of the audit record?

"Who did it" and "when did they do it" are a fair bit more central to
"how did we get here" (viz., the SHA1 of the top commit, if you will)
than "what branch was this commit born in" (or similar).

Here's an example from somewhere I worked (indirectly) in the late 90s.
Nasty bug, easily fixable (a few characters to change).  Customer group
all p-ed off.  Developer has access to the version control server.  He
changes something on the VC system to appear as if the bug never
existed in the version of the code he shipped to whoever.  As a result,
the bug was deemed to have mysteriously appeared somewhere along the
line.  It didn't help that parts of the workflow were semi-manual, so
he *did* have vague things to point at.

I don't believe I can explain that any better or go into details
without some risk, so if you don't agree then that's all there is to
it.  Suffice it to say I am strongly opposed to the idea, but as long
as it's optional -- and for the right reasons (see my other email) --
I'd be OK.
Re: Recording the current branch on each commit?
On 04/28/2014 01:03 AM, Johan Herland wrote:

On Sun, Apr 27, 2014 at 7:38 PM, Jeremy Morton ad...@game-point.net wrote:

On 27/04/2014 10:09, Johan Herland wrote:

On Sun, Apr 27, 2014 at 1:56 AM, Jeremy Morton ad...@game-point.net wrote:

Currently, git records a checksum, author, commit date/time, and commit
message with every commit (as can be seen from 'git log').  I think it
would be useful if, along with the Author and Date, git recorded the
name of the current branch on each commit.

This has been discussed multiple times in the past.  One example here:
http://thread.gmane.org/gmane.comp.version-control.git/229422

I believe the current conclusion (if any) is that encoding such
information as a _structural_ part of the commit object is not useful.
See the old thread(s) for the actual pro/con arguments.

As far as I can tell from that discussion, the general opposition to
encoding the branch name as a structural part of the commit object is
that, for some people's workflows, it would be unhelpful and/or
misleading.  Well, fair enough then -- why don't we make it a setting
that is off by default, and can easily be switched on?  That way the
people for whom tagging the branch name would be useful have a very
easy way to switch it on.

Obviously, the feature would necessarily have to be optional, simply
because Git would have to keep understanding the old commit object
format for a LONG time (probably indefinitely), and there's nothing you
can do to prevent others from creating old-style commit objects.

Which brings us to another big con at this point: the cost of changing
the commit object format.  One can argue for or against a new commit
object format, but the simple truth at this point is that changing the
structure of the commit object is expensive.  Even if we were all in
agreement about the change (and so far we are not), there are multiple
Git implementations (libgit2, jgit, dulwich, etc.)
that would all have to learn the new commit object, not to mention that
bumping core.repositoryformatversion would probably make your git repo
incompatible with a huge number of existing deployments for the
foreseeable future.

Therefore, the most pragmatic and constructive thing to do at this
point is IMHO to work within the confines of the existing commit object
structure.  I actually believe using commit message trailers like
"Made-on-branch: frotz", in addition to some helpful infrastructure
(hooks, templates, git-interpret-trailers, etc.), should get you pretty
much exactly what you want.  And if this feature turns out to be
extremely useful for a lot of users, we can certainly consider changing
the commit object format in the future.

I know that for the workflows I personally have used in the past, such
tagging would be very useful.  Quite often I have been looking through
the Git log and wondered what feature a commit was part of, because I
have feature branches.  Just knowing that branch name would be really
useful, but the branch has since been deleted... and in the case of a
ff-merge (which I thought was recommended in Git if possible), the
branch name is completely gone.

True.  The branch name is -- for better or worse -- simply not
considered very important by Git, and a Git commit is simply not
considered (by Git, at least) to be part of or otherwise belong to any
branch.  Instead, the commit history/graph is what Git considers
important, and the branch names are really just more-or-less ephemeral
pointers into that graph.  AFAIK, recording the current branch name in
commits was not considered to be worth including in Linus' original
design, and since then it seems to only have come up a few times on the
mailing list.  This is quite central to Git's design, and changing it
at this point should not be done lightly.  IINM, Mercurial does this
differently, so that may be a better fit for the workflows where
keeping track of branch names is very important.
That said, you are of course free to add this information to your own
commit messages, by appending something like "Made-on-branch: frotz".
In a company setting, you can even create a commit message template or
(prepare-)commit-msg hook to have this line created automatically for
you and your co-workers.  You could even append such information
retroactively to existing commits with git notes.  There is also the
current interpret-trailers effort by Christian Couder [1] that should
be useful in creating and managing such lines.

[1]: http://thread.gmane.org/gmane.comp.version-control.git/245874

Well, I guess that's another way of doing it.  So, why aren't Author
and Date trailers?  They don't seem any more fundamental to me than
branch name.  I mean, the only checkin information you really *need* is
the checksum and the commit's parents.  The Author and Date are just
extra pieces of information you might find useful sometimes, right?  A
bit like some people might find branch checkin name useful sometimes...?

Yeah, sure.  Author and Date (and Committer, for that
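The "Made-on-branch" trailer suggested in the thread above can be sketched as a prepare-commit-msg hook. This is a minimal illustration, not anything git or the thread participants shipped; the trailer name is purely a convention:

```shell
#!/bin/sh
# Sketch: a prepare-commit-msg hook that appends a "Made-on-branch:"
# trailer naming the branch the commit is created on.  The throwaway
# repo exists only to make the sketch runnable end to end.
set -e
repo=$(mktemp -d)
git init -q "$repo"

cat >"$repo/.git/hooks/prepare-commit-msg" <<'EOF'
#!/bin/sh
msgfile=$1
branch=$(git symbolic-ref --short -q HEAD) || exit 0  # skip on detached HEAD
# add the trailer only if it isn't already there
grep -q '^Made-on-branch:' "$msgfile" ||
    printf '\nMade-on-branch: %s\n' "$branch" >>"$msgfile"
EOF
chmod +x "$repo/.git/hooks/prepare-commit-msg"

git -C "$repo" checkout -q -b frotz
git -C "$repo" -c user.name=t -c user.email=t@example.com \
    commit -q --allow-empty -m 'try the hook'

trailer=$(git -C "$repo" log -1 --format=%B | grep '^Made-on-branch:')
echo "$trailer"   # -> Made-on-branch: frotz
```

Because the trailer lives in the message, `git log --grep='Made-on-branch: frotz'` (or git-interpret-trailers) can later answer "which feature branch did this come from", even after the branch is deleted or ff-merged.
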
Re: Trust issues with hooks and config files
On 03/09/2014 10:57 PM, Julian Brost wrote:
> On 07.03.2014 22:04, Jeff King wrote:
>> Yes, this is a well-known issue.  The only safe operation on a
>> repository for which somebody else controls hooks and config is to
>> fetch from it (upload-pack on the remote repository does not respect
>> any dangerous config or hooks).
>
> I'm a little bit surprised that you and some other people I asked see
> this as such a low-priority problem, as this makes social engineering
> attacks on multi-user systems, like they are common at universities,
> really easy (this is also how I noticed the problem).

This, and a lot more control-related issues, are solved by tools like
gitolite (which I maintain) and Gerrit (from Google), and also many GUI
based access control tools like gitlab, gitorious, etc.  In these
schemes the user does not have a unix account on the server, and any
hooks that run will run as the hosting user.  Access is either via ssh
pub keys (most commonly) or http auth.

It is my belief that most multi-user systems have installed one of
these systems, and therefore the situation you speak of does not arise.
They probably didn't install them to solve *this* problem, but to keep
some semblance of control over who can access what repo; as a
side-effect, though, they solve this problem also.

sitaram

>> It has been discussed, but nobody has produced patches.  I think that
>> nobody is really interested in doing so because:
>>
>>   1. It introduces complications into previously-working setups
>>      (e.g., a daemon running as "nobody" serving repos owned by a
>>      "git" user needs to mark "git" as trusted).
>>
>>   2. In most cases, cross-server boundaries provide sufficient
>>      insulation (e.g., I might not push to an evil person's repo, but
>>      rather to a shared repo whose hooks and config are controlled by
>>      root on the remote server).
>>
>> If you want to work on it, I think it's an interesting area.  But any
>> development would need to think about the transition plan for
>> existing sites that will be broken.
> I can understand the problem with backward compatibility, but in my
> opinion the default behavior should definitely be to ignore untrusted
> config files and hooks, as it would otherwise only protect users that
> are already aware of the issue anyway and manually enable this option.
> Are there any plans for some major release in the future that would
> allow introducing backward incompatible changes?  I would definitely
> spend some time working on a patch, but so far I have no idea of git's
> internals and never looked at the source before.
>
>> At the very least, the current trust model could stand to be
>> documented much better (I do not think the rule of "fetching is safe,
>> everything else is not" is mentioned anywhere explicitly).
>
> Good point, but not enough in my opinion, as I haven't read every git
> manpage before running it for the first time.
Re: [ANNOUNCE] Git v1.9-rc0
On 01/28/2014 05:58 PM, Kacper Kornet wrote: On Mon, Jan 27, 2014 at 10:58:29AM -0800, Jonathan Nieder wrote: Hi, Kacper Kornet wrote: The change in release numbering also breaks gitolite v2 setups. One of the gitolite commands, gl-compile-conf, expects the output of git --version to match /git version (\d+)\.(\d+)\.(\d+)/. I have no idea how big a problem it is, as I don't know how many people haven't migrated to gitolite v3 yet.

http://qa.debian.org/popcon.php?package=gitolite says there are some. I guess soon we'll see if there are complaints. http://gitolite.com/gitolite/migr.html says gitolite v2 is still maintained. Hopefully the patch to gitolite v2 to fix this would not be too invasive --- e.g., how about this patch (untested)? Thanks, Jonathan

diff --git i/src/gl-compile-conf w/src/gl-compile-conf
index f497ae5..8508313 100755
--- i/src/gl-compile-conf
+++ w/src/gl-compile-conf
@@ -394,8 +394,9 @@ die the server. If it is not, please edit ~/.gitolite.rc on the server and set the \$GIT_PATH variable to the correct value\n unless $git_version;
-my ($gv_maj, $gv_min, $gv_patchrel) = ($git_version =~ m/git version (\d+)\.(\d+)\.(\d+)/);
+my ($gv_maj, $gv_min, $gv_patchrel) = ($git_version =~ m/git version (\d+)\.(\d+)\.([^.-]*)/);
 die $ABRT I can't understand $git_version\n unless ($gv_maj = 1);
+$gv_patchrel = 0 unless ($gv_patchrel =~ m/^\d+$/);
 $git_version = $gv_maj*1 + $gv_min*100 + $gv_patchrel; # now it's normalised
 die \n\t\t* AAARGH! *\n .

It works for 1.9.rc1 but I think it will fail with final 1.9. The following version should be ok:

diff --git a/src/gl-compile-conf b/src/gl-compile-conf
index f497ae5..c391468 100755
--- a/src/gl-compile-conf
+++ b/src/gl-compile-conf
@@ -394,7 +394,7 @@ die the server.
If it is not, please edit ~/.gitolite.rc on the server and set the \$GIT_PATH variable to the correct value\n unless $git_version;
-my ($gv_maj, $gv_min, $gv_patchrel) = ($git_version =~ m/git version (\d+)\.(\d+)\.(\d+)/);
+my ($gv_maj, $gv_min, undef, $gv_patchrel) = ($git_version =~ m/git version (\d+)\.(\d+)(\.(\d+))?/);
 die $ABRT I can't understand $git_version\n unless ($gv_maj = 1);
 $git_version = $gv_maj*1 + $gv_min*100 + $gv_patchrel; # now it's normalised

Gitolite v3 will be 2 years old in a month or so. I would prefer to use this as an opportunity to encourage people to upgrade; v2 really has nothing going for it now. People who cannot upgrade gitolite should simply cut that whole block of code and throw it out. Distros should probably do that if they are still keeping gitolite v2 alive, because it is clearly not needed if the same distro is up to 1.9 of git! My policy has been to continue support for critical bugs. A bug that can be fixed by simply deleting the offending code block, with no harm done, is -- to my mind -- not critical enough :-)

Side note, Kacper: I'd use non-capturing parens rather than an undef in the destination list.
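For reference, the normalisation being patched here can be sketched outside Perl too. Below is a plain-shell rendition of gitolite v2's formula (maj*1 + min*100 + patchlevel) that tolerates versions like "1.9" and "1.9.rc1"; the function name is my own invention, not gitolite code:

```shell
# Hypothetical helper, not gitolite code: normalise "git version X.Y[.Z...]"
# into the single number gitolite v2 computes (maj*1 + min*100 + patchlevel),
# treating a missing or non-numeric patchlevel (e.g. "1.9", "1.9.rc1") as 0.
normalize_git_version() {
    v=${1#git version }
    maj=${v%%.*}
    rest=${v#*.}
    min=${rest%%.*}
    rest2=${rest#*.}
    if [ "$rest2" = "$rest" ]; then
        patch=0                      # no third component at all, e.g. "1.9"
    else
        patch=${rest2%%.*}           # first chunk after X.Y
        case $patch in
            ''|*[!0-9]*) patch=0 ;;  # non-numeric, e.g. "rc1"
        esac
    fi
    echo $(( maj * 1 + min * 100 + patch ))
}

normalize_git_version "git version 1.8.3.1"   # prints 804
```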
Re: Access different NAMESPACE of remote repo from client side
On 11/16/2013 01:30 PM, Jiang Xin wrote: 2013/11/15 Sitaram Chamarty sitar...@gmail.com: On 11/15/2013 07:55 PM, Sitaram Chamarty wrote: [snip] I should add that the Gitolite model is: the user doesn't need to know about namespaces, because namespaces are just things that the server admin is setting up for his own reasons... I want to point out that git-namespace is sometimes significant for normal users, not only for admins.

Sure. I only meant in the model that I wrote that branch for. But consider a slight change of syntax:

    repo dev/CREATOR/..*
        C   = @team
        RW+ = CREATOR
        R   = @all
        option namespace.pattern-1 = dev/%/%/NS/* is @1/@3 in @2

Let's say linux and git are parent repos already created (maybe earlier in the conf). This conf will let you use URLs like

    dev/u1/git/NS/bar (becomes namespace u1/bar in git)
    dev/u2/git/NS/baz (becomes namespace u2/baz in git)

Yeah it looks like a kludge, but all I wanted to do was to show you that it's not entirely true to say the client cannot control the namespace!
Re: Access different NAMESPACE of remote repo from client side
On 11/15/2013 01:49 PM, Jiang Xin wrote: GIT_NAMESPACE is designed to be used mainly on the server side, so that the server can serve multiple git repositories while sharing one single repository storage, using different GIT_NAMESPACE settings. Since we know that one remote repository hosts multiple namespaces, can we handle different namespaces in one local repository? Or can we access the proper namespace of the remote repository without complicated server settings? There are at least three solutions for the ssh protocol: pass the namespace through the environment, pass the namespace in the URL, or pass the namespace through proper settings of remote.name.receivepack and remote.name.uploadpack.

Solution 1: passing the namespace through the environment.

1. Set '/etc/sshd_config' on the server side as follows, so that the ssh server accepts the GIT_NAMESPACE environment variable:

    AcceptEnv LANG LC_* GIT_NAMESPACE

2. On the client side, when connecting to the ssh server, you must send the GIT_NAMESPACE environment variable. This can be done with a remote-ext url:

    $ git remote add foo \
        'ext::ssh -o SendEnv=GIT_NAMESPACE git@server %S 'path/to/repo.git'

Then the remote foo is GIT_NAMESPACE aware, but when operating on this remote, you must provide the proper --namespace option:

    $ git --namespace=foo push foo master
    $ git --namespace=foo fetch foo
    $ git --namespace=foo ls-remote foo
    $ git --namespace=foo remote prune foo
    $ git --namespace=foo archive --remote foo HEAD

But providing a --namespace option is error-prone; we may invent remote.name.namespace or something to set GIT_NAMESPACE automatically when pushing to or fetching from the remote server.

Solution 2: passing the namespace in the URL. Again use a remote-ext style URL to access the remote repository:

    $ git remote add foo \
        'ext::ssh git@server git --namespace foo %s path/to/repo.git'
    $ git remote add bar \
        'ext::ssh git@server git --namespace bar %s path/to/repo.git'

But if the remote server uses a limited shell (such as git-shell or gitolite), the above URLs won't work.
This is because these git-specific shells (git-shell or gitolite) do not accept options.

Solution 3: use custom receivepack and uploadpack, e.g.

    [remote "foo"]
        url = ssh://git@server/path/to/repo.git
        receivepack = git --namespace foo receive-pack
        uploadpack = git --namespace foo upload-pack
        fetch = +refs/heads/*:refs/remotes/foo/*
    [remote "bar"]
        url = ssh://git@server/path/to/repo.git
        receivepack = git --namespace bar receive-pack
        uploadpack = git --namespace bar upload-pack
        fetch = +refs/heads/*:refs/remotes/bar/*

Just like solution 2, these settings won't work without a patched git-shell or gitolite.

Gitolite has a namespaces branch that handles namespaces as described in http://gitolite.com/gitolite/namespaces.html Briefly, it recognises that you can have a main repo off of which several developers might want to hang their logical repos. It also recognises that the actual names of the logical repos will follow some pattern that may include the name of the developer also, and provides a way to derive the name of the physical repo from the logical one. There is an example or two in that link.
Re: Access different NAMESPACE of remote repo from client side
On 11/15/2013 07:55 PM, Sitaram Chamarty wrote: On 11/15/2013 01:49 PM, Jiang Xin wrote: GIT_NAMESPACE is designed to be used mainly on the server side, that the server can serve multiple git repositories while share one single repository storage using different GIT_NAMESPACE settings. Since we know that one remote repository hosts multiple namespaces, can we handle different namespaces in one local repository? Or can we access the proper namespace of the remote repository without complicated server settings? At least there are three solutions for ssh protocol: pass namespace through environment, pass namespace in URL, or pass namespace from the proper settings of remote.name.receivepack and remote.name.uploadpack. Solution 1: passing the namespace through environment. 1. Set '/etc/sshd_config' in the server side as the following, so that the ssh server can accept GIT_NAMESPACE environment. AcceptEnv LANG LC_* GIT_NAMESPACE 2. In the client side, When connect to ssh server, must send the GIT_NAMESPACE environment. This can be done with a remote-ext url: $ git remote add foo \ 'ext::ssh -o SendEnv=GIT_NAMESPACE git@server %S 'path/to/repo.git' Then the remote foo is GIT_NAMESPACE aware, but when operate on this remote, must provide proper --namespace option. $ git --namespace=foo push foo master $ git --namespace=foo fetch foo $ git --namespace=foo ls-remote foo $ git --namespace=foo remote prune foo $ git --namespace=foo archive --remote foo HEAD But provide a --namespace option is error-prone, but we may invent remote.name.namespace or something to set GIT_NAMESPACE automatically when push to or fetch from remote server. Solution 2: passing the namespace in URL. 
Again use a remote-ext style URL to access the remote repository:

    $ git remote add foo \
        'ext::ssh git@server git --namespace foo %s path/to/repo.git'
    $ git remote add bar \
        'ext::ssh git@server git --namespace bar %s path/to/repo.git'

But if the remote server uses a limited shell (such as git-shell or gitolite), the above URLs won't work. This is because these git-specific shells (git-shell or gitolite) do not accept options.

Solution 3: use custom receivepack and uploadpack, e.g.

    [remote "foo"]
        url = ssh://git@server/path/to/repo.git
        receivepack = git --namespace foo receive-pack
        uploadpack = git --namespace foo upload-pack
        fetch = +refs/heads/*:refs/remotes/foo/*
    [remote "bar"]
        url = ssh://git@server/path/to/repo.git
        receivepack = git --namespace bar receive-pack
        uploadpack = git --namespace bar upload-pack
        fetch = +refs/heads/*:refs/remotes/bar/*

Just like solution 2, these settings won't work without a patched git-shell or gitolite.

Gitolite has a namespaces branch that handles namespaces as described in http://gitolite.com/gitolite/namespaces.html Briefly, it recognises that you can have a main repo off of which several developers might want to hang their logical repos. It also recognises that the actual names of the logical repos will follow some pattern that may include the name of the developer also, and provides a way to derive the name of the physical repo from the logical one. There is an example or two in that link.

I should add that the Gitolite model is: the user doesn't need to know about namespaces, because namespaces are just things that the server admin is setting up for his own reasons... typically because he anticipates several dozen people cloning the same repo into their namespaces, and so he expects to save a lot of disk space doing this. So in this model we don't really need anything on the client side.
Re: can we prevent reflog deletion when branch is deleted?
On 11/14/2013 01:37 PM, Jeff King wrote: On Thu, Nov 14, 2013 at 08:56:07AM +0100, Thomas Rast wrote: Whatever it was that happened to a hundred or more repos on the Jenkins project seems to be stirring up this debate in some circles. Making us so curious ... and then you just leave us hanging there ;-) Oh, my apologies; I missed the URL!! (But Peff supplied it before I saw this email!) Any pointers to this debate? I do not know about any particular debate in git circles, but I assume Sitaram is referring to this incident: https://groups.google.com/d/msg/jenkinsci-dev/-myjRIPcVwU/t4nkXONp8qgJ in which a Jenkins dev force-pushed and rewound history on 150 different repos. In this case the reflog made rollback easy, but if he had pushed a deletion, it would be harder.

I don't know if they had a reflog on the server side; they used client-side reflogs, if I understood correctly. I'm talking about the server side (bare repo), assuming the site has core.logAllRefUpdates set.

And I'll explain the "some circles" part: it was something on LinkedIn. To be honest there's been a fair bit of FUDding by CVCS types there, so I stopped looking at the posts, but I get the subject lines by email and I saw one that said "Git History Protection - if we needed proof..." or something like that. I admit I didn't check to see if a debate actually followed that post :-)
Re: can we prevent reflog deletion when branch is deleted?
On 11/14/2013 04:39 PM, Jeff King wrote: On Thu, Nov 14, 2013 at 04:26:46PM +0530, Sitaram Chamarty wrote: I do not know about any particular debate in git circles, but I assume Sitaram is referring to this incident: https://groups.google.com/d/msg/jenkinsci-dev/-myjRIPcVwU/t4nkXONp8qgJ in which a Jenkins dev force-pushed and rewound history on 150 different repos. In this case the reflog made rollback easy, but if he had pushed a deletion, it would be harder. I don't know if they had a reflog on the server side; they used client-side reflogs, if I understood correctly. I'm talking about the server side (bare repo), assuming the site has core.logAllRefUpdates set. Yes, they did have server-side reflogs (the pushes were to GitHub, and we reflog everything). Client-side reflogs would not be sufficient, as the client who pushed does not record the history he just rewound (he _might_ have it at refs/remotes/origin/master@{1}, but if somebody pushed since his last fetch, then he doesn't). The simplest way to recover is to just have everyone push again (without --force). The history will just silently fast-forward to whoever has the most recent tip. The downside is that you have to wait for that person to actually push. :) I think they started with that, and then eventually GitHub support got wind of it and pulled the last value for each repo out of the server-side reflog for them.

Great. But what does GitHub do if the branches were *deleted* by mistake (say someone does a git push --mirror; most likely in a script, for added fun and laughs!)? GitHub may be able to help people recover from that also, but plain Git won't. And that's what I would like to see a change in.

-Peff
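For what it's worth, the recovery Peff describes is mechanical once a reflog exists. A local sketch, simulating the force-push damage with update-ref and then pulling the pre-rewind tip back out of the reflog (paths and messages invented):

```shell
# Recover a rewound branch from its reflog, as a server admin could do
# in a bare repo where core.logAllRefUpdates is set.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git config user.email you@example.com
git config user.name you
echo one > f;  git add f; git commit -qm one
echo two >> f; git commit -qam two
branch=$(git symbolic-ref --short HEAD)
tip=$(git rev-parse HEAD)
# simulate the damage a force-push does: rewind the branch by one commit
git update-ref -m "forced rewind" "refs/heads/$branch" HEAD~1
# the reflog still records the pre-rewind tip as the previous value
lost=$(git rev-parse "$branch@{1}")
git update-ref -m "restore from reflog" "refs/heads/$branch" "$lost"
```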
Re: can we prevent reflog deletion when branch is deleted?
On 11/14/2013 04:47 PM, Luca Milanesio wrote: Would be really useful anyway to have the ability to create a server-side reference based on a SHA-1, using the Git protocol. Alternatively, just fetching a remote repo based on a SHA-1 (not referenced by any ref-spec but still existent) so that you can create a new reference locally and push. That's a security issue. Just to clarify, what I am asking for is the ability to recover on the server, where you have access to the actual files that comprise the repo. sitaram Luca. On 14 Nov 2013, at 11:09, Jeff King p...@peff.net wrote: On Thu, Nov 14, 2013 at 04:26:46PM +0530, Sitaram Chamarty wrote: I do not know about any particular debate in git circles, but I assume Sitaram is referring to this incident: https://groups.google.com/d/msg/jenkinsci-dev/-myjRIPcVwU/t4nkXONp8qgJ in which a Jenkins dev force-pushed and rewound history on 150 different repos. In this case the reflog made rollback easy, but if he had pushed a deletion, it would be harder. I don't know if they had a reflog on the server side; they used client-side reflogs if I understood correctly. I'm talking about server side (bare repo), assuming the site has core.logAllRefUpdates set. Yes, they did have server-side reflogs (the pushes were to GitHub, and we reflog everything). Client-side reflogs would not be sufficient, as the client who pushed does not record the history he just rewound (he _might_ have it at refs/remotes/origin/master@{1}, but if somebody pushed since his last fetch, then he doesn't). The simplest way to recover is to just have everyone push again (without --force). The history will just silently fast-forward to whoever has the most recent tip. The downside is that you have to wait for that person to actually push. :) I think they started with that, and then eventually GitHub support got wind of it and pulled the last value for each repo out of the server-side reflog for them. 
-Peff
Re: can we prevent reflog deletion when branch is deleted?
On 11/14/2013 01:44 PM, Jeff King wrote: On Thu, Nov 14, 2013 at 05:48:50AM +0530, Sitaram Chamarty wrote: Is there *any* way we can preserve a reflog for a deleted branch, perhaps under logs/refs/deleted/timestamp/full/ref/name ? I had patches to do something like this here: http://thread.gmane.org/gmane.comp.version-control.git/201715/focus=201752 but there were definitely some buggy corners, as much of the code assumed you needed to have a ref to have a reflog. I don't even run with it locally anymore. At GitHub, we log each change to an audit log in addition to the regular reflog (we also stuff extra data from the environment into the reflog message). So even after a branch is deleted, its audit log entries remain, though you have to pull out the data by hand (git doesn't know about it at all, except as an append-only sink for writing). And git doesn't use the audit log for connectivity, either, so eventually the objects could be pruned. Just some basic protection -- don't delete the reflog, and instead, rename it to something that preserves the name but in a different namespace. That part is easy. Accessing it seamlessly and handling reflog expiration are a little harder. Not because they're intractable, but just because there are some low-level assumptions in the git code. The patch series I mentioned above mostly works. It probably just needs somebody to go through and find the corner cases.

The use cases I am talking about are those where someone deleted something and it was noticed well within the earliest of Git's expire timeouts. So, no need to worry about expiry times or connecting it with object pruning. Really, just the equivalent of a cp or mv of one file is all that most people need. Gitolite's log is the same. So no one who uses Gitolite needs this feature. But people shouldn't have to install Gitolite or anything else just to get this either!
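The cp-or-mv being asked for can indeed be scripted today, outside git proper. A sketch, assuming the files ref backend (where reflogs live as plain files under .git/logs/); the "deleted" path layout is the one proposed in this thread, not anything git knows about:

```shell
# Before deleting a branch, copy its reflog into a "deleted" namespace so
# the entries survive `git branch -D`. Assumes the files ref backend.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git config user.email you@example.com
git config user.name you
echo x > f; git add f; git commit -qm x
git branch topic                  # also creates logs/refs/heads/topic
gitdir=$(git rev-parse --absolute-git-dir)
ref=refs/heads/topic
stamp=$(date +%Y%m%d-%H%M%S)
mkdir -p "$(dirname "$gitdir/logs/refs/deleted/$stamp/$ref")"
cp "$gitdir/logs/$ref" "$gitdir/logs/refs/deleted/$stamp/$ref"
git branch -D topic               # branch and its live reflog are gone
```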
Re: can we prevent reflog deletion when branch is deleted?
I can't resist... On 11/14/2013 08:12 PM, Stephen Bash wrote: [snipped some stuff from Peff] [snipped 60 lines of python] In honor of your last name, here's what I would do if I needed to log ref updates (and wasn't using Gitolite):

    #!/bin/bash
    # -- use this as a post-receive hook
    while read old new ref
    do
        echo "$(date '+%F %T'): Successful push by $USER of $ref from $old to $new"
    done >> push-log.txt
Re: can we prevent reflog deletion when branch is deleted?
[top posting, and not preserving cc's because the original email thread below is just for context; I don't want to force people into a discussion that they may have considered closed :-)]

Is there *any* way we can preserve a reflog for a deleted branch, perhaps under logs/refs/deleted/timestamp/full/ref/name ? Whatever it was that happened to a hundred or more repos on the Jenkins project seems to be stirring up this debate in some circles. Just some basic protection -- don't delete the reflog, and instead, rename it to something that preserves the name but in a different namespace.

sitaram

On 06/01/2013 11:26 PM, Ramkumar Ramachandra wrote: Sitaram Chamarty wrote: I think I'd have to be playing with *several* branches simultaneously before I got to the point of forgetting the branch name! Yeah, I work on lots of small unrelated things: the patch-series I send in are usually the result of a few hours of work (up to a few days). I keep the branch around until I've rewritten it for enough re-rolls and am sufficiently sure that it'll hit master. More to the point, your use case may be relevant for a non-bare repo where work is being done, but for a bare repo on a server, I think the branch name *does* have significance, because it's what people are collaborating on. (Imagine someone accidentally nukes a branch, and then someone else tries to git pull and finds it gone. Any recovery at that point must necessarily use the branch name.) Ah, you're mostly talking about central workflows. I'm on the other end of the spectrum: I want triangular workflows (and git.git is slowly getting there). However, I might have a (vague) thought on server-side safety in general: I think the harsh dichotomy of ff-only versus non-ff branches is very inelegant.
Imposing ff-only feels like a hammer solution, because what happens in practice is different: `master` does not need to be rewritten most of the time, but I think it's useful to allow some safe rewrites to undo the mistake of checking in a private key or something [*1*]. By safety, I mean that git should give the user easy access to recent dangling objects by annotating them with enough information: sort of like a general-purpose pretty reflog that is gc-safe (configurable trunc_length?). It serves more use cases than just the branch-removal problem. Of course, the standard disclaimer applies: there's a high likelihood that I'm saying nonsense, because I've never worked in a central environment.

[Footnotes]

*1* It turns out that this is not uncommon: https://github.com/search?q=path%3A.ssh%2Fid_rsa&type=Code&ref=searchresults
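The ff-only "hammer" being discussed is just two pieces of standard receive-side config, receive.denyNonFastForwards and receive.denyDeletes. A local sketch with plain paths standing in for a real server (all names invented):

```shell
# A central repo that refuses rewinds and ref deletions outright.
set -e
tmp=$(mktemp -d)
git init -q --bare "$tmp/central.git"
git -C "$tmp/central.git" config receive.denyNonFastForwards true
git -C "$tmp/central.git" config receive.denyDeletes true
git init -q "$tmp/work"
cd "$tmp/work"
git config user.email you@example.com
git config user.name you
echo x > f; git add f; git commit -qm x
# a normal (fast-forward) push is accepted as usual
git push -q "$tmp/central.git" HEAD:refs/heads/master
```

With this config, a later `git push "$tmp/central.git" :refs/heads/master` (a deletion) or any forced rewind is rejected by the server, which is exactly the blunt protection the thread is weighing against finer-grained "safe rewrites".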
Re: Has there been any discussion about resumable clones recently ?
On 09/20/2013 04:48 AM, shirish शिरीष wrote: Hi all, First of all a big thank you to all for making git, with it being fast and cheap (in relation to bandwidth and sizes for subsequent checkouts, as well as CPU usage). Please CC me if somebody does answer this mail as I'm not subscribed to the list. The thing is, I have faced failures a number of times while trying to clone a large repo. The only solution, it seems, is to ask somebody to make a git-bundle, get that bundle via wget or rsync, and then unbundle it.

Just want to mention that if the server is running gitolite, the admin can set things up so that this is easy and painless, either for all repos or just some specific ones. Such repos can then be cloned like this:

    rsync -P git@host:repo.bundle .
        # downloads a file called <basename of repo>.bundle; repeat as
        # needed till the whole thing is downloaded
    git clone repo.bundle repo
    cd repo
    git remote set-url origin git@host:repo
    git fetch origin
        # and maybe git pull, etc., to freshen the clone

(yes, I know this is not really a substitute for resumable clone; call it a stop-gap until that happens!)
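The server side of this stop-gap is just git bundle. A local sketch of the whole round trip, with temporary paths standing in for the rsync transfer:

```shell
# Create a bundle of the whole repo, verify it, and clone from it; the
# bundle file is what would be shipped over rsync/wget and resumed at will.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git config user.email you@example.com
git config user.name you
echo hello > f; git add f; git commit -qm initial
git bundle create "$tmp/repo.bundle" HEAD --all
git bundle verify "$tmp/repo.bundle"
# the downloaded file is a perfectly good clone source
git clone -q "$tmp/repo.bundle" "$tmp/clone"
```

After cloning, pointing origin back at the live repo (as in the email's `git remote set-url` step) lets subsequent fetches transfer only what the bundle didn't already contain.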
[PATCH] warn about http server document being too old
- describe when it is still applicable
- tell people where to go for most normal cases

Signed-off-by: Sitaram Chamarty sita...@atc.tcs.com
---
ref: http://thread.gmane.org/gmane.comp.version-control.git/159633. Yes it's very old but better late than never.

 Documentation/howto/setup-git-server-over-http.txt | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Documentation/howto/setup-git-server-over-http.txt b/Documentation/howto/setup-git-server-over-http.txt
index 7f4943e..90b19a0 100644
--- a/Documentation/howto/setup-git-server-over-http.txt
+++ b/Documentation/howto/setup-git-server-over-http.txt
@@ -3,6 +3,11 @@ Subject: Setting up a Git repository which can be pushed into and pulled from ov
 Date: Thu, 10 Aug 2006 22:00:26 +0200
 Content-type: text/asciidoc

+NOTE: This document is from 2006. A lot has happened since then, and this
+document is now relevant mainly if your web host is not CGI capable.
+
+Almost everyone else should instead look at linkgit:git-http-backend[1].
+
 How to setup Git server over http
 =
--
1.8.3.1
Re: Locking files / git
On 09/18/2013 01:15 AM, Nicolas Adenis-Lamarre wrote: Ooops. It seems that each time somebody says these two words together, people hate him, and he is scorned by friends and family. However, - gitolite implements it (but gitolite is not git). No. It pretends to implement it, for people who absolutely must have something and are willing to play by the rules. Quoting from the doc [1]: "When git is used in a truly distributed fashion, locking is impossible." I wrote it as a sort of bandaid, and that is all it is. "Implement" is too kind a word. regards sitaram
Re: Locking files / git
On 09/18/2013 03:42 PM, Nicolas Adenis-Lamarre wrote: Thanks a lot for your answer. Those are really good arguments that I was waiting for and had not gotten until now. My understanding now: - it's not easy to maintain several versions of a binary file in parallel. So basically, it's not recommended to have complex workflows for binary files. In case the project has a low number of binary files, it can be handled by simple communication. Yes. Since you mentioned gitolite in your original post, I assume you read this caution also in the doc [1]: Of course, locking by itself is not quite enough. You may still get into merge situations if you make changes in branches. For best results you should actually keep all the binary files in their own branch, separate from the ones containing source code. The point is that locking and distribution don't go together at all. The **core** of distributed VCS is the old "coding on an airplane" story. What if someone locks a file after I am in the air, and I manage to get in a good 4 hours of solid work? CVCSs can also get into this situation, but to a lesser extent, I think. At least you won't be able to commit! [1]: http://gitolite.com/gitolite/locking.html In case the project has a lot of binary files, a simple centralized workflow is recommended. - git doesn't hate locks; it's just that it's not the layer to implement them, because git is workflow-independent. Locks depend on a centralized server, which is directly linked to the workflow. I'm not trying to implement such a workflow. I'm just curious, reading a lot of things about git, and trying to understand what is sometimes called a limitation of git. It's not a limitation of git. It's a fundamental conflict between the idea of "distributed" and what locking necessitates.
A simple line in the documentation saying that locking should be handled in the upper layer (as it is, for example, in gitolite) because it depends on the workflow could help some people looking into that point. For people who don't realise how important the D in DVCS is, and assume some sort of a central server will always exist, this simple line won't do. You'd have to explain all of that. And for people who do understand it, it's not necessary :-) Thanks a lot for git. 2013/9/17 Fredrik Gustafsson iv...@iveqy.com: On Tue, Sep 17, 2013 at 09:45:26PM +0200, Nicolas Adenis-Lamarre wrote: Ooops. It seems that each time somebody says these two words together, people hate him, and he is scorned by friends and family. For the moment, I want a first feedback, an intermediate between "locking is bad" and "ok", and I would prefer, in the negative answer, something with arguments ("Take CVS as an example of what not to do; if in doubt, make the exact opposite decision." is one), and in the positive answer, good remarks about problems with my implementation that could make it better. So working with locks and text files is generally stupid to do with git, since you don't use git's merging capabilities. Working with binary files in git is stupid because git doesn't handle them very well: the deltas can't be calculated very well. It seems to me that if you need to do locking, one of the above scenarios is true for you and you should not use git at all. However, there's always the case when you have a mixed project with both binary and text files. In that case I believe Jeff gave an excellent answer. But think twice: are you using git in a sane way? Even a small binary file will result in a huge git repo if it's updated often and the project has a long history.
-- Best regards, Fredrik Gustafsson tel: 0733-608274 e-mail: iv...@iveqy.com
Re: [Proposal] Clonable scripts
On 09/10/2013 02:18 AM, Niels Basjes wrote: As we all know, the hooks (in .git/hooks) are not cloned along with the code of a project. Now this is a correct approach for the scripts that do stuff like emailing the people responsible for releases or submitting the commit to a CI system. For several other things it makes a lot of sense to give the developer immediate feedback. Things like the format of the commit message (i.e. it must start with an issue tracker id) or compliance with a coding standard. Initially I wanted to propose introducing fully clonable (pre-commit) hook scripts. However, I can imagine that a malicious open-source coder could create a github repo and try to hack the computer of a contributor via those scripts. So having such scripts is a 'bad idea'. If, however, those scripts were written in a language that is built into the git program, and the scripts were run in such a way that they can only interact with the files in the local git repo (and _nothing_ outside of that), this would be solved. Having a built-in scripting language also means that this would run on all operating systems (yes, even Windows). So I propose the following new feature:

1) A scripting language is put inside git. Perhaps a version of python or ruby or go or ... (no need for a 'new' language)

2) If a project contains a folder called .githooks in the root of the code base, then the rules/scripts that are present there are executed ONLY on the system doing the actual commit. These scripts are run in such a limited way that they can only read the files in the repository; they cannot do any networking, write to disk, etc., and they can only do a limited set of actions against the current operation at hand (i.e. do checks, parse messages, etc.).

3) For the regular hooks this language is also supported, and when located in the (not cloned!) .git/hooks directory they are just as powerful as a normal script (i.e. can control CI, send emails, etc.).
> Like I said, this is just a proposal and I would like to know what you guys think.

I am not in favour of any idea like this. It will end in some sort of compromise (in both senses of the word!) It has to be voluntary, but we can make it easier. I suggest something like this:

- some special directory can have normal hook files, but it's just a placeholder.
- each hook code file comes with some metadata at the top, say githook name, hook name, version, remote-name. I'll use these examples: pre-commit crlf-check 1.1 origin
- on a clone/pull, if there is a change to any of these code files when compared to the previous HEAD, and if the program is running interactively, then you can ask and set up these hooks.

The purpose of the remote name in the stored metadata is that we don't want to bother updating when we pull from some other repo, like when merging a feature branch. The purpose of the version number is so you can do some intelligent things, even silently upgrade under certain conditions.

All we're doing is making things easier compared to what you can already do even now (which is completely manual and instructions-based). I don't think anything more intrusive or forced is wise. And to people who say it is OK, I'm going to seriously wonder if you work for the NSA (directly or indirectly). Sadly, that is not meant to be a joke question; such is life now.
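The voluntary scheme sketched above can be approximated today with a tiny bootstrap script. Everything here is invented for illustration (the `.githooks-meta` directory name, the `# meta:` header format); it is not a git feature, just a sketch of "hook file with metadata at the top, installed only with the user's consent":

```shell
#!/bin/sh
# Sketch of the voluntary hook-setup idea above -- NOT a real git feature.
# Hook files live in a tracked placeholder dir (.githooks-meta here) and
# carry a first line like "# meta: <hook> <id> <version> <remote>".
set -e

# Copy each metadata-carrying hook file into .git/hooks.
install_hooks() {
    worktree=$1
    for f in "$worktree"/.githooks-meta/*; do
        [ -f "$f" ] || continue
        # print line 1 only if it matches the metadata prefix
        meta=$(sed -n '1s/^# meta: //p' "$f")
        [ -n "$meta" ] || continue
        set -- $meta    # split into: hook-name id version remote
        cp "$f" "$worktree/.git/hooks/$1"
        chmod +x "$worktree/.git/hooks/$1"
        echo "installed $1 ($2 v$3, remote $4)"
    done
}

# Demo on a throwaway layout (a real version would ask before installing):
wt=$(mktemp -d)
mkdir -p "$wt/.githooks-meta" "$wt/.git/hooks"
printf '# meta: pre-commit crlf-check 1.1 origin\necho crlf check\n' \
    > "$wt/.githooks-meta/pre-commit"
msg=$(install_hooks "$wt")
echo "$msg"
```

A real implementation would also compare the hook files against the previous HEAD and only prompt on change, as described above.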
Re: ephemeral-branches instead of detached-head?
On 08/14/2013 07:14 AM, Junio C Hamano wrote:
> Sitaram Chamarty <sitar...@gmail.com> writes:
>> # all reflog entries that are not on a branch, tag, or remote
>> d1 = !gitk --date-order $(git log -g --pretty=%H) --not --branches --tags --remotes
>> # all dangling commits not on a branch, tag, or remote
>> d2 = !gitk --date-order $(git fsck | grep dangling.commit | cut -f3 -d' ') --not --branches --tags --remotes
>> (Apologies if something like this was already said; I was not following the discussion closely enough to notice)
>
> Yup. A potential problem is that the output from "log -g --pretty=%H" or "fsck | grep dangling" may turn out to be humongous. Other than that, they correctly compute what you want.

I thought I mentioned that, but I can't find my email now, so maybe I didn't. In practice though, I find that bash at least seems happy to take command lines as long as 7+ million characters, so with the default reflog expire times, that should work out to 10,000 commits *per day*. [Tested with: echo {100..190} > junk; echo `cat junk` | wc]

Incidentally, am I the only one who thinks the default values for gc.reflogexpire (90 days) and gc.reflogexpireunreachable (30) should be reversed? In terms of recovering potentially lost commits at least, it seems it would make more sense for something that is UNreachable to have a longer expiry, whereas stuff that's reachable... that's only a quick gitk browse away.

Design question: is the primary purpose of the reflog "what was I doing X days ago", or is it "I need some code from a commit that got rebased out [or whatever] X days ago"? I have always only used the reflog for the latter.
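For what it's worth, anyone who shares this preference can swap the two values locally today; a sketch of the relevant snippet in .git/config (the key names are the real gc settings, the values just mirror the suggestion above):

```
[gc]
	# defaults are 90 days reachable / 30 days unreachable; this
	# reverses them, keeping rebased-out commits around longer
	reflogExpire = 30 days
	reflogExpireUnreachable = 90 days
```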
Re: ephemeral-branches instead of detached-head?
On 08/14/2013 12:40 PM, Andres Perera wrote:
> On Wed, Aug 14, 2013 at 2:02 AM, Sitaram Chamarty <sitar...@gmail.com> wrote:
>> [snip earlier discussion of the d1/d2 aliases]
>> In practice though, I find that bash at least seems happy to take command lines as long as 7+ million characters, so with the default reflog expire times, that should work out to 10,000 commits *per day*.
>
> echo is a builtin in bash, as is the case with other shell implementations; builtins may have different limits than exec()'s ARG_MAX:
>
> $ getconf ARG_MAX
> 262144
> $ perl -e 'print "A" x (262144 * 2)' | wc -c
> 524288
> $ perl -e 'print "A" x (262144 * 2)' | sh -c 'read v; echo $v' | wc -c
> 524289
> $ perl -e 'print "A" x (262144 * 2)' | sh -c 'read v; /bin/echo $v' | wc -c
> sh: /bin/echo: Argument list too long
> 0
>
> A builtin's argument buffer limit tends to be aligned with the implementation's lexer buffer limit.

Aah; good catch -- I did not know this. Thanks!

My systems show 2621440 on CentOS 6 and 2097152 on Fedora 19, so -- dividing by 8 (abbrev SHA + space) then by 90 -- that's still 2900 commits *per day* to run past this limit though!
(Side note: a single argument that long seems to hit a much lower limit than the same bytes spread over multiple arguments:

$ /bin/echo `perl -e 'print "A" x (100)'` | wc
-bash: /bin/echo: Argument list too long
0 0 0
$ /bin/echo `perl -e 'print "A " x (100)'` | wc
1 100 200

notice that the second one is twice as long in terms of bytes, but it's not a single argument.)
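The back-of-the-envelope arithmetic above can be checked mechanically. This sketch assumes 8 bytes per reflog entry on the command line (a 7-character abbreviated SHA-1 plus a separating space) and the 90-day default expiry; 2097152 is the ARG_MAX reported on the Fedora 19 box mentioned above, so substitute `getconf ARG_MAX` for your own system:

```shell
#!/bin/sh
# How many commits per day before "gitk $(git log -g --pretty=%H)"
# overflows the kernel's argument-size limit?
arg_max=2097152       # bytes available for exec() arguments (Fedora 19 box)
bytes_per_entry=8     # 7-char abbreviated SHA-1 + separating space
days=90               # default reflog expiry window
per_day=$(( arg_max / bytes_per_entry / days ))
echo "~$per_day commits/day to overflow ARG_MAX"
```

On the CentOS 6 figure (2621440) the same arithmetic gives roughly 3600 per day.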
Re: ephemeral-branches instead of detached-head?
On 08/13/2013 10:19 PM, Junio C Hamano wrote:
> Duy Nguyen <pclo...@gmail.com> writes:
>> On Mon, Aug 12, 2013 at 3:37 PM, David Jeske <dav...@gmail.com> wrote:
>>> Is there currently any way to say "hey, git, show me what commits are dangling that might be lost in the reflog"?
>>
>> How do you define dangling commits? When you do git commit --amend, the current commit will become dangling (in the sense that it's not referred to by any ref, but the commit exists) and those are just noise in my opinion. "fsck lost-and-found" would be one way.
>
> It would be nice if we had something like (note: the following will _NOT_ work)
>
> git log -g HEAD --not --branches
>
> to say "walk the reflog of HEAD, but exclude anything that can be reached from the tips of branches".

I've been using the following 3 aliases for some time now, to find various dangling stuff. The middle one (d1) seems to do approximately what you want, but will probably fail on repos with lots of activity when the command line length limit is (b)reached.

# all stashed entries (since they don't chain)
sk = !gitk --date-order $(git stash list | cut -d: -f1) --not --branches --tags --remotes
# all reflog entries that are not on a branch, tag, or remote
d1 = !gitk --date-order $(git log -g --pretty=%H) --not --branches --tags --remotes
# all dangling commits not on a branch, tag, or remote
d2 = !gitk --date-order $(git fsck | grep dangling.commit | cut -f3 -d' ') --not --branches --tags --remotes

(Apologies if something like this was already said; I was not following the discussion closely enough to notice)
Re: can we prevent reflog deletion when branch is deleted?
On Sat, Jun 1, 2013 at 11:26 PM, Ramkumar Ramachandra <artag...@gmail.com> wrote:
> Sitaram Chamarty wrote:
>> I think I'd have to be playing with *several* branches simultaneously before I got to the point of forgetting the branch name!
>
> Yeah, I work on lots of small unrelated things: the patch series I send in are usually the result of a few hours of work (up to a few days). I keep the branch around until I've rewritten it for enough re-rolls and am sufficiently sure that it'll hit master.
>
>> More to the point, your use case may be relevant for a non-bare repo where work is being done, but for a bare repo on a server, I think the branch name *does* have significance, because it's what people are collaborating on. (Imagine someone accidentally nukes a branch, and then someone else tries to git pull and finds it gone. Any recovery at that point must necessarily use the branch name).
>
> Ah, you're mostly talking about central workflows. I'm on the other

Yes. Not just because that's what $dayjob does, but also because that's what gitolite does.

> end of the spectrum: I want triangular workflows (and git.git is slowly getting there). However, I might have a (vague) thought on server-side safety in general: I think the harsh dichotomy in ff-only versus non-ff branches is very inelegant. Imposing ff-only feels like a hammer solution, because what happens in practice is different: `master` does not need to be rewritten most of the time, but I think it's useful to allow some safe rewrites to undo the mistake of checking in a private key or something [*1*].

By safety, I mean that I suspect that's a big reason for why gitolite is so popular, at least with central workflows. It's trivial to set it up so master is ff-only and any other branch is rewindable etc.

> git should give the user easy access to recent dangling objects by annotating them with enough information: sort of like a general-purpose pretty reflog that is gc-safe (configurable trunc_length?).
> It serves more use cases than just the branch-removal problem.

Again, for central workflow folks, gitolite's log files actually have enough info for all this and more. Coupled with core.logAllRefUpdates, it's possible to recover anything that has not been gc-ed, even deleted branches and tags.

But it would be nicer if git's own reflog were able to do that. Hence my original thought about preserving reflogs for deleted refs (even if it is in a graveyard log, to resolve the D/F conflict that Michael and Peff were discussing up at the top of the thread).

> Of course, the standard disclaimer applies: there's a high likelihood that I'm saying nonsense, because I've never worked in a central environment.
>
> [Footnotes]
> *1* It turns out that this is not uncommon: https://github.com/search?q=path%3A.ssh%2Fid_rsatype=Coderef=searchresults

Hah! Lovely...

-- Sitaram
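The setting mentioned above is worth spelling out: bare server-side repos do not log ref updates by default, so the recovery trail discussed here only exists if it is switched on. A sketch of the relevant config snippet (the key is a real git setting):

```
[core]
	# bare repos default this to false; turning it on makes every
	# ref update (including rewinds) leave a recoverable reflog entry,
	# though reflogs still vanish when the ref itself is deleted
	logAllRefUpdates = true
```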
Re: can we prevent reflog deletion when branch is deleted?
On Sat, Jun 1, 2013 at 3:17 PM, Ramkumar Ramachandra <artag...@gmail.com> wrote:
> Jeff King wrote:
>> Why don't the branch names have significance? If I deleted branch foo yesterday evening, wouldn't I want to be able to say "show me foo from 2pm yesterday" or even "show me all logs for foo", so that I can pick the useful bit from the list?
>
> Oh, I misunderstood then. I didn't realize that your use case was actually git log foo@{yesterday} where foo is a deleted branch. Just to give some perspective, so we don't limit our problem space: I only ever batch-delete cold branches: if I haven't touched a branch in ~2 months, I consider the work abandoned (due to disinterest or otherwise) and remove it. Most of my branches are short-lived, and I don't remember branch names, much less the names of the cold branches I deleted. My use case for a graveyard is "I lost something, and I need to find it": I don't want to have to remember the original branch name foo; if you can tell me everything I deleted yesterday, I can spot foo and the commit I was looking for. The HEAD reflog is almost good enough for me.

I think I'd have to be playing with *several* branches simultaneously before I got to the point of forgetting the branch name!

More to the point, your use case may be relevant for a non-bare repo where work is being done, but for a bare repo on a server, I think the branch name *does* have significance, because it's what people are collaborating on. (Imagine someone accidentally nukes a branch, and then someone else tries to git pull and finds it gone. Any recovery at that point must necessarily use the branch name).

PS: I am assuming core.logAllRefUpdates is on
can we prevent reflog deletion when branch is deleted?
Hi,

Is there a way to prevent reflog deletion when the branch is deleted? The last entry could simply be a line where the second SHA is all 0's.

-- Sitaram
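For context, each line of a reflog file (.git/logs/refs/heads/<branch>) already has the fixed shape below; the suggestion above amounts to appending one final entry whose new-value field is the all-zeros SHA. The concrete entry shown is invented for illustration:

```
# <old-sha1> <new-sha1> <committer> <email> <timestamp> <tz>\t<message>
1a410efbd13591db07496601ebc7a059dd55cfe9 0000000000000000000000000000000000000000 Sitaram Chamarty <sitaram@example.com> 1370000000 +0530	branch: deleted (hypothetical tombstone entry)
```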
Re: propagating repo corruption across clone
On Wed, Mar 27, 2013 at 8:33 PM, Junio C Hamano <gits...@pobox.com> wrote:
> Sitaram Chamarty <sitar...@gmail.com> writes:
>> On Wed, Mar 27, 2013 at 9:17 AM, Junio C Hamano <gits...@pobox.com> wrote:
>>> To be paranoid, you may want to set transfer.fsckObjects to true, perhaps in your ~/.gitconfig.
>>
>> do we have any numbers on the overhead of this? Even a guesstimate will do...
>
> On a reasonably slow machine:
>
> $ cd /project/git/git.git
> $ git repack -a -d
> $ ls -hl .git/objects/pack/*.pack
> -r--r--r-- 1 junio src 44M Mar 26 13:18 .git/objects/pack/pack-c40635e5ee2b7094eb0e2c416e921a2b129bd8d2.pack
> $ cd ..
> $ git --bare init junk
> $ cd junk
> $ time git index-pack --strict --stdin < ../git.git/.git/objects/pack/*.pack
> real    0m13.873s
> user    0m21.345s
> sys     0m2.248s
>
> That's about 3.2 MB/s? Compare that with the speed your other side feeds you (or your line speed could be the limiting factor) and decide how much you value your data.

Thanks. I was also interested in overhead on the server, just as a percentage. I have no idea why, but when I did some tests a long time ago I got upwards of 40% or so. Now when I try these tests for git.git:

cd <some empty dir>
git init --bare
# git config transfer.fsckobjects true
git fetch file:///full/path/to/git.git refs/*:refs/*

the difference in elapsed time is 18s -> 22s, so about 22%, and CPU time is 31 -> 37, so about 20%. I didn't measure disk access increases, but I guess 20% is not too bad. Is it likely to be linear in the size of the repo, by and large?
Re: Git prompt
On Mon, Feb 11, 2013 at 4:24 AM, Junio C Hamano <gits...@pobox.com> wrote:
> Jeff King <p...@peff.net> writes:
>> On Sun, Feb 10, 2013 at 01:25:38PM -0800, Jonathan Nieder wrote:
>>> Ethan Reesor wrote:
>>>> I have a git user set up on my server. Its prompt is set to git-prompt and its git-shell-commands is empty. [...] How do I make the git user work like github where, upon attempting to get a prompt, the connection is closed?
>>>
>>> I assume you mean that the user's login shell is git-shell. You can disable interactive logins by removing the ~/git-shell-commands/ directory. Unfortunately that doesn't let you customize the message. Perhaps it would make sense to teach shell.c to look for a
>>>
>>> [shell]
>>>     greeting = 'Hi %(username)! You've successfully authenticated, but I do not provide interactive shell access.'
>>>
>>> setting in git's config file. What do you think?
>>
>> I think something like that makes sense. To my knowledge there is no way with stock git to customize git-shell's output (at GitHub, that message comes from our front-end routing process before you even hit git-shell on our backend machines). The username in our version of the message comes from a database mapping public keys to GitHub users, not the Unix username. But I suspect sites running stock Git would be happy enough to have %(username) map to the actual Unix username.
>
> Yeah, that greeting is cute---I like it ;-)

Indeed! In gitolite, I borrowed that idea and added to it by making it print a list of repos you have access to, along with what permissions (R or RW) you have :-)

I'm not suggesting git should do that, but instead of a fixed string, a default command to be executed would be better. That command could do anything the local site wanted it to, including something equivalent to what I just said. This of course means that ~/git-shell-commands should not be empty, since that is where this default command will also be present.
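As an editorial aside: git-shell did later grow something along these lines, a "no-interactive-login" command that, if present and executable in ~/git-shell-commands, is run before the connection is closed instead of offering a prompt. A sketch (the message wording is obviously up to the site; a temp dir stands in for the home directory so the demo is self-contained):

```shell
#!/bin/sh
# Sketch of git-shell's no-interactive-login hook: if this file exists in
# ~/git-shell-commands, git-shell runs it and aborts the interactive session.
set -e
dir=$(mktemp -d)    # stand-in for ~/git-shell-commands in this demo
cat > "$dir/no-interactive-login" <<'EOF'
#!/bin/sh
printf 'Hi %s! You have successfully authenticated, but I do not\n' "${USER:-there}"
printf 'provide interactive shell access.\n'
exit 128
EOF
chmod +x "$dir/no-interactive-login"
greeting=$("$dir/no-interactive-login") || true   # non-zero exit is the point
echo "$greeting"
```

The non-zero exit mirrors how the real hook refuses the login after printing its message.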
Re: How to diff 2 file revisions with gitk
On Wed, Feb 6, 2013 at 9:27 PM, R. Diez <rdiezmail-buspir...@yahoo.de> wrote:
> Hi there:
> I asked a few days ago whether I could easily diff 2 file revisions with the mouse in gitk, but I got no reply yet; see here: "How to diff two file revisions with the mouse (with gitk)" https://groups.google.com/forum/#!topic/git-users/9znsQsTB0dE
> I am hoping that it was the wrong mailing list, and this one is the right one. 8-) Here is the full question text again:
>
> I would like to start gitk, select with the mouse 2 revisions of some file and then compare them, hopefully with an external diff tool, very much like I am used to with WinCVS. The closest I got is to start gitk with a filename as an argument, in order to restrict the log to that one file. Then I right-click on a commit (a file revision) and choose "Mark this commit". However, if I right-click on another commit and choose "Compare with marked commit", I get a full commit diff with all files, and not just the file I specified on the command-line arguments. Selecting a filename in the Tree view and choosing "Highlight this only", as I found on the Internet, does not seem to help. I have git 1.7.9 (on Cygwin). Can someone help? By the way, it would be nice if gitk could launch the external diff tool from the "Compare with marked commit" option too.

I don't know if I misunderstood the whole question, because the answer is very simple:

- start gitk
- left click the newer commit
- scroll to the older commit
- right click the older commit and choose "Diff this -> selected"
- in the bottom right pane, pick any file, right click, and choose "External diff"
Re: Feature request: Allow extracting revisions into directories
On Mon, Feb 4, 2013 at 10:22 PM, Junio C Hamano <gits...@pobox.com> wrote:
> Tomas Carnecky <tomas.carne...@gmail.com> writes:
>> That's what `git checkout` is for. And I would even argue that it's the better choice in your situation, because it would delete files from /var/www/foo which you have deleted in your repo. git archive|tar wouldn't do that.
>
> The point about removal is an interesting one. From that /var/www location I guess that you are discussing some webapp, but if you let it _write_ into it, you may also have to worry about how to handle the case where an update from the source end that comes from the checkout and an update by the webapp collide with each other. You also need to maintain the .git/index file that corresponds to what should be in /var/www/foo/ if you go that route.
>
> Just to be sure, I am not saying checkout is an inappropriate solution to whatever problem you are trying to solve. I am just pointing out things you need to be aware of if you take that approach.

http://sitaramc.github.com/the-list-and-irc/deploy.html is a fairly popular URL on #git, where the bot responds to !deploy with some text and this URL. Just sayin'... :)
Re: Feature request: Allow extracting revisions into directories
On 02/03/2013 07:48 PM, Robert Clausecker wrote:
> Hello!
> git currently has the archive command, which allows saving an arbitrary revision into a tar or zip file. Sometimes it is useful not to save this revision into an archive, but to directly put all files into an arbitrary directory. Currently this seems not to be possible directly; the only way I found to do it is to run git archive and then directly unpack the archive into a directory:
>
> git --git-dir REPO archive REVISION | tar x
>
> It would be nice to have a command, or simply a switch to git archive, that allows the user to put the files of REVISION into a directory instead of making an archive.

Could you help me understand why piping it to tar (actually 'tar -C /dest/dir -x') is not sufficient to achieve what you want?
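The pipe pattern under discussion is the ordinary producer/consumer tar idiom. The sketch below demonstrates the same shape without needing a repository: plain tar stands in for `git archive` on the producing side (in real use, the left side of the pipe would be `git --git-dir=REPO archive REVISION`), and the temp-dir paths are throwaway placeholders:

```shell
#!/bin/sh
# The extract-a-revision-into-a-directory idiom from the thread:
#   git --git-dir=REPO archive REVISION | tar -C /dest/dir -x
# Demonstrated here with tar alone: a tar stream produced on one side of
# the pipe is unpacked into a target directory on the other.
set -e
src=$(mktemp -d)
dst=$(mktemp -d)
echo 'hello deploy' > "$src/index.html"
tar -cf - -C "$src" . | tar -xf - -C "$dst"
cat "$dst/index.html"
```

Because `tar -x` is the consumer, any of its flags (-C, -k, -m, ...) apply at extraction time, which is exactly the point made later in this thread.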
Re: Feature request: Allow extracting revisions into directories
On 02/03/2013 11:41 PM, Robert Clausecker wrote:
> On Sunday, 03.02.2013, at 21:55 +0530, Sitaram Chamarty wrote:
>> Could you help me understand why piping it to tar (actually 'tar -C /dest/dir -x') is not sufficient to achieve what you want?
>
> Piping the output of git archive into tar is of course a possible solution; I just don't like the fact that you need to pipe the output into a separate program to do something that should be possible with a simple switch and not an extra command. It feels unintuitive, like a workaround, to make an archive just to unpack it on-the-fly. Also, adding such a command (or at least documenting the way to do this using a pipe to tar somewhere in the man pages) is a small and simple change that improves usability.

I realise it appears to be the fashion these days to get away from the Unix philosophy of having different tools do different things and combining them as needed. Ignoring the option-heavy GNU tar, and looking at the more traditional BSD tar manpage [1], I notice the following flags that could still be potentially needed by someone running 'git archive': '-t' (instead of '-x'), '-C dir', '--exclude/--include', '-k', '-m', '--numeric-owner', '-o', '-P', '-p', '-q', '-s', '-T', '-U', '-v', '-w', and '-X'. And I'm ignoring the esoteric ones like --chroot and -S (sparse mode).

How many of these options would you like included in git? And if you say "I don't need any of those; I just need '-x'", that's not relevant: someone else may need any or all of those flags, and if you accept -x you have to accept some of the others too.

Also, I often want to deploy to a different host, and I might do that like so:

git archive ... | ssh host tar -C /deploy/dir -x

Why not put that ssh functionality into git also? What about computing a checksum and putting out a sha1sums.txt file? People do that also. How about a flag for that? Where does this end?
[1]: http://www.unix.com/man-page/FreeBSD/1/tar/
Re: How to identify the users?
On 01/31/2013 12:23 PM, Scott Yan wrote:
> Sitaram: It seems I must host my central repo on Gitolite first...

There is no "must", but yes, it is a decent solution and can, in principle, do the kind of checking you want if you set it up to do that. Please note that I don't use that mode and, as my rant would have indicated, I don't think it's a smart thing to do.

> I don't know Gitolite much, but you are right, maybe I should use Gitolite as my git server. I'll find more documents about gitolite these days; can you give me some suggestion on which tutorial I should read? Thanks!
> ps: my OS is windows.

Try http://therightstuff.de/CommentView,guid,b969ea4d-8d2c-42af-9806-de3631f4df68.aspx

I normally don't mention blog posts (favouring instead the official documentation), but Windows is an exception. Hence the link. Good luck.
Re: Anybody know a website with up-to-date git documentation?
On Wed, Jan 30, 2013 at 7:28 PM, Max Horn <m...@quendi.de> wrote:
> On 30.01.2013, at 12:54, John Keeping wrote:
>> On Wed, Jan 30, 2013 at 12:46:47PM +0100, Max Horn wrote:
>>> does anybody know a website where one can view the latest git documentation? Here, "latest" means latest release (though being able to access it for "next" would of course be a nice bonus, likewise for older versions). While I do have those docs on my local machine, I would like to access them online, too (e.g. easier to point people at this, I can access it from other machines, etc.).
>>
>> How about http://git-htmldocs.googlecode.com/git/ ? It's just a directory listing of the git-htmldocs repository that Junio maintains -- the latest update was yesterday: "Autogenerated HTML docs for v1.8.1.2-422-g08c0e". [I didn't know Google Code let you view the repository like that, but I got there by clicking the "raw" link against one of the files, so I assume it's not likely to go away.]
>
> Thanks John, that looks pretty good! In addition, I just discovered http://manned.org/git-remote-helpers/2b9e4c86 which contains git docs from Arch Linux, Debian, FreeBSD and Ubuntu packages. And since Arch tends to have the latest, so does manned.org. And best, it even lets me browse to older versions of a file. So, taken together, I guess that solves my problem -- with John's link, I can see the bleeding edge versions, and with manned.org the latest released version (as soon as Arch Linux catches up, which seems to be pretty quick :-).

I'm curious... what's wrong with 'git checkout html' from the git repo and just browsing them using a web browser?
Re: Anybody know a website with up-to-date git documentation?
On Wed, Jan 30, 2013 at 09:18:24AM -0800, Junio C Hamano wrote:
> Max Horn <m...@quendi.de> writes:
>
> [administrivia: please wrap lines to a reasonable width]

Curiously, gmail's web interface appears to have started doing this only recently. I've noticed it when trying to respond to others too.

>> On 30.01.2013, at 16:59, Sitaram Chamarty wrote:
>>> I'm curious... what's wrong with 'git checkout html' from the git repo and just browsing them using a web browser?
>>
>> Hm, do you mean "make html", perhaps? At least I couldn't figure out what "git checkout html" should do, but out of curiosity gave it a try and got an error...
>
> Perhaps some information from "A note from the maintainer" (posted to this list from time to time) is lacking. Some excerpts:
>
>     You can browse the HTML manual pages at:
>     http://git-htmldocs.googlecode.com/git/git.html
>
>     Preformatted documentation from the tip of the master branch can be found in:
>     git://git.kernel.org/pub/scm/git/git-{htmldocs,manpages}.git/
>     git://repo.or.cz/git-{htmldocs,manpages}.git/
>     ...
>
> Armed with that knowledge, I think Sitaram may have something like this:
>
>     [remote "htmldocs"]
>         url = git://git.kernel.org/pub/scm/git/git-htmldocs.git/
>         fetch = +refs/heads/master:refs/heads/html
>
> and does
>
>     git fetch htmldocs
>     git checkout html

Hmm; I don't recall ever doing that. But I just realised that my html branch is stuck at 1.7.7:

$ git branch -v -v | grep html
html 8fb66e5 [origin/html] Autogenerated HTML docs for v1.7.7-138-g7f41b6

Is it possible that up to that point, the main git.git repo did carry this branch also?
I have 3 remotes:

$ git remote -v
gc      https://code.google.com/p/git-core (fetch)
gc      https://code.google.com/p/git-core (push)
ghgit   git://github.com/git/git.git (fetch)
ghgit   git://github.com/git/git.git (push)
origin  git://git.kernel.org/pub/scm/git/git.git (fetch)
origin  git://git.kernel.org/pub/scm/git/git.git (push)

and all 3 of them carry this branch:

$ git branch -a -v -v | grep html
html                 8fb66e5 [origin/html] Autogenerated HTML docs for v1.7.7-138-g7f41b6
remotes/gc/html      8fb66e5 Autogenerated HTML docs for v1.7.7-138-g7f41b6
remotes/ghgit/html   8fb66e5 Autogenerated HTML docs for v1.7.7-138-g7f41b6
remotes/origin/html  8fb66e5 Autogenerated HTML docs for v1.7.7-138-g7f41b6

Even if I had, at one point, added a remote specifically for html, I am sure it could not have created those refs! So I tried a prune:

$ git remote update --prune
Fetching origin
 x [deleted]         (none)     -> origin/html
 x [deleted]         (none)     -> origin/man
Fetching ghgit
 x [deleted]         (none)     -> ghgit/html
 x [deleted]         (none)     -> ghgit/man
Fetching gc
 x [deleted]         (none)     -> gc/html
 x [deleted]         (none)     -> gc/man

and now I'm on par with the rest of you ;-) You can, too, of course ;-)

You can do even more! If you don't find a suitable website for this, it's trivial to host it yourself. You can host it on your intranet, if you have one.
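A minimal sketch of that self-hosting suggestion, assuming the "htmldocs" remote is configured as in Junio's excerpt above; the static server and port here are arbitrary choices, and any static file server will do:

```
$ git fetch htmldocs
$ git checkout html
$ python -m http.server 8080
# then browse http://localhost:8080/git.html
```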
Re: missing objects -- prevention
On Sat, Jan 12, 2013 at 6:43 PM, Jeff King <p...@peff.net> wrote:
> On Sat, Jan 12, 2013 at 06:39:52AM +0530, Sitaram Chamarty wrote:
>>> 1. The repo has a ref R pointing at commit X.
>>> 2. A user starts a push to another ref, Q, of commit Y that builds on X. Git advertises ref R, so the sender knows they do not need to send X, but only Y. The user then proceeds to send the packfile (which might take a very long time).
>>> 3. Meanwhile, another user deletes ref R. X becomes unreferenced.
>>
>> The gitolite logs show that no deletion of refs has happened.
>
> To be pedantic, step 3 could also be rewinding R to a commit before X. Anything that causes X to become unreferenced.

Right, but there were no rewinds either; I should have mentioned that. (Gitolite log files mark rewinds and deletes specially, so they're easy to search. There were two attempted rewinds, but they failed the gitolite update hook, so -- while the new objects would have landed in the object store -- the old ones were not dereferenced.)

>>> There is a race with simultaneously deleting and packing refs. It doesn't cause object db corruption, but it will cause refs to rewind back to their packed versions. I have seen that one in practice (though relatively rare). I fixed it in b3f1280, which is not yet in any released version.
>>
>> This is for the packed-refs file, right? And it could result in a ref getting deleted, right?
>
> Yes, if the ref was not previously packed, it could result in the ref being deleted entirely.

>> I said above that the gitolite logs say no ref was deleted. What if the ref deletion happened because of this race, making the rest of your 4-step scenario above possible?
>
> It's possible. I do want to highlight how unlikely it is, though.

Agreed.

>>> up in the middle, or fsck rejects the pack). We have historically left
>>
>> fsck... you mean if I had 'receive.fsckObjects' true, right? I don't. Should I? Would it help this overall situation?
>> As I understand it, that's only about the internals of each object, to check corruption; it cannot detect a *missing* object in the local object store.
>
> Right, I meant if you have receive.fsckObjects on. It won't help this situation at all, as we already do a connectivity check separate from the fsck. But I do recommend it in general, just because it helps catch bad objects before they get disseminated to a wider audience (at which point it is often infeasible to rewind history). And it has found git bugs (e.g., null sha1s in tree entries).

I will add this. Any idea if there's a significant performance hit?

>>> At GitHub, we've taken to just cleaning them up aggressively (I think after an hour), though I am tempted to put in an optional signal/atexit
>>
>> OK; I'll do the same then. I suppose a cron job is the best way; I didn't find any config for expiring these files.
>
> If you run git prune --expire=1.hour.ago, it should prune stale tmp_pack_* files more than an hour old. But you may not be comfortable with such a short expiration for the objects themselves. :)

>> Thanks again for your help. I'm going to treat it (for now) as a disk/fs error after hearing from you about the other possibility I mentioned above, although I find it hard to believe one repo can be hit by *two* races occurring together!
>
> Yeah, the race seems pretty unlikely (though it could be just the one race with a rewind). As I said, I haven't actually ever seen it in practice. In my experience, though, disk/fs issues do not manifest as just missing objects, but as corrupted packfiles (e.g., the packfile directory entry ends up pointing to the wrong inode, which is easy to see because the inode's content is actually a reflog). And then of course with the packfile unreadable, you have missing objects. But YMMV, depending on the fs and what's happened to the machine to cause the fs problem.

That's always the hard part. System admins (at the Unix level) insist there's nothing wrong and no disk errors and so on...
that is why I was interested in network errors causing problems and so on. Anyway, now that I know the tmp_pack_* files are caused mostly by failed pushes rather than by failed auto-gc, at least I can deal with the immediate problem easily! Thanks once again for your patient replies!

--
Sitaram

--
To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
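Jeff's `git prune --expire=1.hour.ago` suggestion is easy to put in a cron job. A minimal sketch in a throwaway repo (the abandoned tmp_pack_* file is simulated with a backdated touch, not a real failed push):

```shell
# Sketch of the cleanup Jeff describes: "git prune" with a short
# --expire removes stale tmp_pack_* files left behind by failed pushes.
# Throwaway repo; names here are illustrative.
set -e
git init -q demo.git && cd demo.git
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m init
# simulate a tmp_pack_* file abandoned by a failed push two hours ago
touch -d '2 hours ago' .git/objects/pack/tmp_pack_abandoned
git prune --expire=1.hour.ago
ls .git/objects/pack/      # the stale tmp_pack_* file is gone
```

Run from cron against each repo, this keeps the pack directory clean without the short expiration ever applying to recent, possibly in-flight objects.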
Re: missing objects -- prevention
Thanks for the very detailed answer.

On Fri, Jan 11, 2013 at 10:12 PM, Jeff King p...@peff.net wrote:

On Fri, Jan 11, 2013 at 04:40:38PM +0530, Sitaram Chamarty wrote:

I find a lot of info on how to recover from and/or repair a repo that has missing (or corrupted) objects. What I need is info on common reasons (other than disk errors -- we've checked for those) for such errors to occur, any preventive measures we can take, and so on.

I don't think any race can cause corruption of the object or packfiles because of the way they are written. At GitHub, every case of file-level corruption we've seen has been a filesystem issue. So I think the main systemic/race issue to worry about is missing objects. And since git only deletes objects during a prune (assuming you are using git-gc or repack -A so that repack cannot drop objects), I think prune is the only thing to watch out for.

No one runs anything manually under normal conditions. If there's any gc happening, it's gc --auto.

The --expire time saves us from the obvious races where you write object X but have not yet referenced it, and a simultaneous prune wants to delete it. However, it's possible that you have an old object that is unreferenced, but would become referenced as a result of an in-progress operation. For example, commit X is unreferenced and ready to be pruned, you build commit Y on top of it, but before you write the ref, git-prune removes X. The server-side version of that would happen via receive-pack, and is even more unlikely, because X would have to be referenced initially for us to advertise it. So it's something like:

1. The repo has a ref R pointing at commit X.

2. A user starts a push to another ref, Q, of commit Y that builds on X. Git advertises ref R, so the sender knows they do not need to send X, but only Y. The user then proceeds to send the packfile (which might take a very long time).

3. Meanwhile, another user deletes ref R. X becomes unreferenced.
The gitolite logs show that no deletion of refs has happened.

4. After step 3 but before step 2 has finished, somebody runs prune (this might sound unlikely, but if you kick off a gc job after each push, or after N pushes, it's not so unlikely). It sees that X is unreferenced, and it may very well be older than the --expire setting. Prune deletes X.

5. The packfile in (2) arrives, and receive-pack attempts to update the refs.

So it's even a bit more unlikely than the local case, because receive-pack would not otherwise build on dangling objects. You have to race steps (2) and (3) just to create the situation. Then we have an extra protection in the form of check_everything_connected, which receive-pack runs before writing the refs into place. So if step 4 happens while the packfile is being sent (which is the most likely case, since it is the longest stretch of receive-pack's time), we would still catch it there and reject the push (annoying to the user, but the repo remains consistent).

However, that's not foolproof. We might hit step 4 after we've checked that everything is connected but right before we write the ref. In which case we drop X, which has just become referenced, and we have a missing object. So I think it's possible. But I have never actually seen it in practice, and came up with this scenario only by brainstorming "what could go wrong" scenarios.

This could be mitigated if there was a "proposed refs" storage. Receive-pack would write a note saying "consider Y for pruning purposes, but it's not really referenced yet", check connectivity for Y against the current refs, and then eventually write Y to its real ref (or reject it if there are problems). Prune would either run before the proposed note is written, which would mean it deletes X, but the connectivity check fails. Or it would run after, in which case it would leave X alone.

For example, can *any* type of network error or race condition cause this?
(Say, can one push write an object, then fail an update check, and a later push succeed and race against a gc that removes the unreachable object?) Or... the repo is pretty large -- about 6-7 GB, so could size cause a race that would not show up on a smaller repo?

The above is the only open issue I know about. I don't think it is dependent on repo size, but the window is widened for a really large push, because rev-list takes longer to run. It does not widen if you have receive.fsckobjects set, because that happens before we do the connectivity check (and the connectivity check is run in a sub-process, so the race timer starts when we exec rev-list, which may open and mmap packfiles or otherwise cache the presence of X in memory).

Anything else I can watch out for or caution the team about?

That's the only open issue I know about for missing objects. There is a race with simultaneously deleting and packing
Re: Pushing symbolic references to remote repositories?
On Sat, Dec 22, 2012 at 11:57 PM, Junio C Hamano gits...@pobox.com wrote:

Andreas Schwab sch...@linux-m68k.org writes:

This is not limited to HEAD, any ref may want to be set up as a symref at a remote repo. For example, I want to set up a symref master -> trunk at a repository I have no shell access to.

That is exactly "the hosting side does not give you an easy way, so pushing seems to be one plausible, but not necessarily the only, way" case, so it is already covered in the discussion.

Just a minor FYI (and at the risk of tooting my own horn) but if the hosting side is gitolite, you can set it up so that any user with write permissions to the repo can run 'git symbolic-ref' with arbitrary arguments even though he does not get a shell. The '-m reason' argument has some constraints, because gitolite does not allow a lot of characters in arguments to remote commands, but that's mostly useless unless you have core.logAllRefUpdates set anyway.
bug? 'git log -M100' is different from 'git log -M100%'
Hi,

When using -M with a number to act as a threshold for declaring a change as being a rename, I found a... quirk. Any 2-digit number after the M will work, but if the number is 100, it will require a % to be appended to be effective. Here's a transcript that will demonstrate the problem when run in an empty directory.

    # setup
    git init
    seq 1 100 > f1
    git add f1
    git commit -m f1
    git rm f1
    ( seq 1 45; seq 1001 1010; seq 56 100 ) > f2
    git add f2
    git commit -m f2

    # here's the buglet
    git log -1 --stat --raw -M      # this tells you the files are 83% similar
    git log -1 --stat --raw -M82    # this shows it like a rename
    git log -1 --stat --raw -M83    # this also
    git log -1 --stat --raw -M84    # this shows two separate files
    git log -1 --stat --raw -M99    # this also
    # so far so good...
    git log -1 --stat --raw -M100   # but this shows it like a rename
    git log -1 --stat --raw -M100%  # adding a percent sign fixes it; now
                                    # they're two separate files.  It seems
                                    # to be required only when you ask for 100%

--
Sitaram Chamarty
[PATCH] clarify -M without % symbol in diff-options
---
 Documentation/diff-options.txt | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index f4f7e25..39f2c50 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -309,7 +309,11 @@ endif::git-log[]
 	index (i.e. amount of addition/deletions compared to the
 	file's size). For example, `-M90%` means git should consider a
 	delete/add pair to be a rename if more than 90% of the file
-	hasn't changed.
+	hasn't changed. Without a `%` sign, the number is to be read as
+	a fraction, with a decimal point before it. I.e., `-M5` becomes
+	0.5, and is thus the same as `-M50%`. Similarly, `-M05` is
+	the same as `-M5%`. To limit detection to exact renames, use
+	`-M100%`.

 -C[n]::
 --find-copies[=n]::
--
1.7.11.7
Re: bug? 'git log -M100' is different from 'git log -M100%'
On Tue, Dec 18, 2012 at 6:55 AM, Junio C Hamano gits...@pobox.com wrote:

Sitaram Chamarty sitar...@gmail.com writes:

When using -M with a number to act as a threshold for declaring a change as being a rename, I found a... quirk. Any 2-digit number after the M will work,...

That is not a "2-digit number". A few historical trivia may help. Originally we said you can use -M2 to choose 2/10 (like gzip taking compression levels between -0 to -9). Then Linus came up with a clever idea to let people specify arbitrary precision by letting you say -M25 to mean 25/100 and -M254 to mean 254/1000. Read the numbers without per-cent as if it has a decimal point before it (i.e. -M005 is talking about 0.005, which is 0.5%). Full hundred per-cent has to be spelled with a per-cent sign for obvious reasons with this scheme, but that cannot be avoided. It is a special case.

Oh nice. Makes sense; thanks! I submitted a patch to diff-options.txt (separately).

regards
sitaram
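Junio's decimal-point reading can be checked directly against the transcript repo from earlier in the thread; a throwaway sketch (file contents as in that transcript, which produce 83% similarity):

```shell
# Verify that -M5 is read as 0.5, i.e. the same as -M50%: with two
# files that are 83% similar, -M5 (50%) detects the rename while
# -M9 (90%) does not.  Throwaway repo for illustration only.
set -e
git init -q m-demo && cd m-demo
seq 1 100 > f1
git add f1
git -c user.name=d -c user.email=d@example.com commit -q -m f1
git rm -q f1
( seq 1 45; seq 1001 1010; seq 56 100 ) > f2
git add f2
git -c user.name=d -c user.email=d@example.com commit -q -m f2
git log -1 --raw -M5    # 50% threshold: a single R083 rename line
git log -1 --raw -M9    # 90% threshold: separate delete and add lines
```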
Re: Python extension commands in git - request for policy change
On Tue, Dec 11, 2012 at 11:14 AM, Patrick Donnelly batr...@batbytes.com wrote:

Sorry I'm late to this party... I'm an Nmap developer that is casually interested in git development. I've been lurking for a while and thought I'd post my thoughts on this thread.

On Sun, Nov 25, 2012 at 6:25 AM, Nguyen Thai Ngoc Duy pclo...@gmail.com wrote:

The most important issues to consider when imagining a future with a hybrid of code in C and some scripting language X are:

* Portability: is X available on all platforms targeted by git, in usable and mutually-compatible versions?

* Startup time: Is the time to start the X interpreter prohibitive? (On my computer, python -c 'pass', which starts the Python interpreter and does nothing, takes about 24ms.) This overhead would be incurred by every command that is not pure C.

* Should the scripting language access the C functionality only by calling pure-C executables or by dynamically or statically linking to a binary module interface? If the former, then the granularity of interactions between X and C is necessarily coarse, and X cannot be used to implement anything but the outermost layer of functionality. If the latter, then the way would be clear to implement much more of git in X (and lua would also be worth considering).

* Learning curve for developers: how difficult is it for a typical git developer to become conversant with X, considering both (1) how likely is it that the typical git developer already knows X and (2) how straightforward and predictable is the language X? In this category I think that Python has a huge advantage over Perl, though certainly opinions will differ and Ruby would also be a contender.

* We might also need an embedded language variant, like Jeff's lua experiment. It'd be nice if X can also take this role.

Lua has been an incredible success for Nmap [2] (and other projects). As an embedded scripting language, it's unrivaled in terms of ease of embedding, ease of use for users, and performance.
I would strongly recommend that the git developers seriously consider it.

[snipping the rest; all valid points no doubt]

Does lua have os.putenv() yet? The inability to even *set* an env var before calling something else was a killer for me when I last tried it. That may make it fine as an embedded language (called *by* something else) but it is a bit too frugal to use as a glue language (calls other things).
Re: does a successful 'git gc' imply 'git fsck'
On Sun, Dec 2, 2012 at 3:01 PM, Junio C Hamano gits...@pobox.com wrote:

Sitaram Chamarty sitar...@gmail.com writes:

If I could assume that a successful 'git gc' means an fsck is not needed, I'd save a lot of time. Hence my question.

When it does repack -a, it at least scans the whole history so you would be sure that all the commits and trees are readable for the purpose of enumerating the objects referred by them (and a bit flip in them will likely be noticed by zlib inflation). But a gc does not necessarily run repack -a when it does not see too many pack files, so it can end up scanning only the surface of the history to collect the recently created loose objects into a pack, and stop its traversal without going into existing packfiles.

Thanks; I'd missed this nuance as well...

--
Sitaram
Re: does a successful 'git gc' imply 'git fsck'
On Sun, Dec 2, 2012 at 9:58 AM, Shawn Pearce spea...@spearce.org wrote:

On Sat, Dec 1, 2012 at 6:31 PM, Sitaram Chamarty sitar...@gmail.com wrote:

Background: I have a situation where I have to fix up a few hundred repos in terms of 'git gc' (the auto gc seems to have failed in many cases; they have far more than 6700 loose objects). I also found some corrupted objects in some cases that prevent the gc from completing. I am running git gc followed by git fsck. The majority of the repos I have worked through so far appear to be fine, but in the larger repos (upwards of 2-3 GB) the git fsck is taking almost 5 times longer than the 'gc'. If I could assume that a successful 'git gc' means an fsck is not needed, I'd save a lot of time. Hence my question.

Not really. For example fsck verifies that every blob when decompressed and fully inflated matches its SHA-1.

OK, that makes sense. After I posted I happened to check using strace and kinda guessed this from what I saw, but it's nice to have confirmation.

gc only checks connectivity of the commit and tree graph by making sure every object was accounted for. But when creating the output pack it only verifies a CRC-32 was correct when copying the bits from the source to the destination; it does not verify that the data decompresses and matches the SHA-1 it should match. So it depends on what level of check you need to feel safe.

Yup; thanks. All the repos my internal client manages are mirrored in multiple places, and they set (or were at least told to set, heh!) receive.fsckObjects, so the lesser check is fine in most cases.

--
Sitaram
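The distinction Shawn draws (full inflate-and-rehash in fsck vs a cheaper CRC copy check) is easy to see by deliberately corrupting a loose object: fsck notices because it re-inflates and re-hashes everything. A throwaway sketch, not something to run on a real repo:

```shell
# Corrupt a loose object in a throwaway repo and watch fsck catch it.
set -e
git init -q fsck-demo && cd fsck-demo
echo hello > f
git add f
git -c user.name=d -c user.email=d@example.com commit -q -m f
obj=$(find .git/objects -type f | head -n 1)
chmod u+w "$obj"
printf garbage > "$obj"   # overwrite the object's content in place
git fsck || true          # reports a corrupt/unreadable object
```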
Re: Python extension commands in git - request for policy change
On Tue, Nov 27, 2012 at 1:24 PM, David Aguilar dav...@gmail.com wrote:

*cough* git-cola *cough* it runs everywhere. Yes, windows too. It's written in python. It's been actively maintained since 2007. It's modern and has features that don't exist anywhere else. It even has tests. It even comes with a building full of willing guinea-pigs^Wtesters that let me know right away when anything goes wrong. It uses Qt but that's really the whole point of Qt - cross-platform. (not sure how that wiki page ended up saying Gnome/GTK?) The DAG aka git-dag (in its master branch, about to be released) is nicer looking than gitk IMO. gitk still has some features that are better too--there's no silver bullet, but the delta is pretty small.

Gitk does a lot of things that people don't realise, since they're not really documented and you have to scrounge around on the UI. The thing is, it's just about the most awesome tool for code archeology I have seen. I realise (from looking at the doc page) that git-cola helps you do all sorts of things, but those are all things I am happier doing at the command line. Gitk does precisely those things which *require* a GUI, where the amount of information presented overwhelms a text interface. The display is concisely designed to give you the maximum information at a minimum space use. For example, a little black square when a commit has a note attached. Even hovering over the arrow-heads, on complex trees where the line gets broken, does something meaningful. If I had to pin it down, the feature I use most often is "Show origin of this line".
Other features I use often are:

 - review a commit file by file ('f' and 'b' keys, also spacebar and 'd')
 - search by SHA1 (4 digits appear to be enough, regardless of how big your repo is)
 - search for commits changing path/dir (while still showing all the commits; i.e., this is not 'git-dag -- README.txt' but within gitk you search up and down for commits touching README.txt)
 - navigating the commit tree looking for stuff

http://sitaramc.github.com/1-basic-usage/gitk.html is my attempt to document some of the stuff I have found and use.

One final point: the DAG on the right wastes enormous amounts of space. Purely subjectively, it is almost jarring on the senses. (If you reduce it, it becomes unreadable).

With all due respect, git-cola/dag isn't anywhere near what gitk does, at least for people who are not afraid of the command line and only need the GUI to visualise a truly complex tree.
Re: Python extension commands in git - request for policy change
On Wed, Nov 28, 2012 at 12:05 AM, Eric S. Raymond e...@thyrsus.com wrote:

Magnus Bäck ba...@google.com:

While "constant traffic" probably overstates the issue, these are not theoretical problems. I recall at least three cases in the last year or so where Git has seen breakage with Solaris or Mac OS X because of sed or tr incompatibilities, and I don't even read this list that thoroughly.

This is exactly the sort of pain experience would lead me to expect. OK, this is where I assume the full Old Fart position (30-year old-school Unix guy, author of The Art of Unix Programming, can remember the years before Perl and still has sh idioms in his fingertips) and say "Get with the 21st century, people! Or at least 1990..." As a general scripting language shell sucks *really badly* compared to anything new-school. Performance, portability, you name it, it's a mess. It's not so much the shell interpreters themselves that are the portability problem, but (as Magnus implicitly points out) all those userland dependencies on sed and tr and awk and even variants of expr(!) that get dragged in the second you try to get any actual work done.

Not always. There are several situations where a shell script that makes good use of grep, cut, etc., is definitely much cleaner and more elegant than anything you can do in a propah programming language. If the price of doing that is sticking to a base set of primitives, it's a small price to pay, not much different from sticking to python 2.7 or perl 5.8 or whatever.

Shell *is* the universal scripting language, not perl (even though we all know it is what God himself used to create the world -- see xkcd 224 if you don't believe me!), not python, not Ruby.

--
sitaram
Re: Python extension commands in git - request for policy change
On Mon, Nov 26, 2012 at 4:17 AM, Eric S. Raymond e...@thyrsus.com wrote:

Krzysztof Mazur krzys...@podlesie.net:

What about embedded systems? git is also useful there. C and shell is everywhere, python is not.

Supposing this is true (and I question it with regard to shell) if you tell me how you live without gitk and the Perl pieces I'll play that right back at you as your answer.

gitk is unlikely to be used on an embedded system, the perl pieces more so.

I have never understood why people complain about readability in perl. Just because it uses the ascii charset a bit more? You expect a mathematician or indeed any scientist to use special symbols to mean special things, why not programmers? Perhaps people should be forced to use COBOL for a few years (like I did, a long while ago) to appreciate brevity.
Re: Local clones aka forks disk size optimization
On Fri, Nov 16, 2012 at 11:34 PM, Enrico Weigelt enrico.weig...@vnc.biz wrote:

Provide one main clone which is bare, pulls automatically, and is there to stay (no pruning), so that all others can use that as a reliable alternates source.

The problem here, IMHO, is the assumption that the main repo will never be cleaned up. But what to do if you don't wanna let it grow forever?

That's not the only problem. I believe you only get the savings when the main repo gets the commits first. Which is probably ok most of the time but it's worth mentioning.

hmm, distributed GC is a tricky problem.

Except for one little issue (see other thread, subject line "cloning a namespace downloads all the objects"), namespaces appear to do everything we want in terms of the typical use cases for alternates, and/or 'git clone -l', at least on the server side.
Re: use cases for git namespaces
On Thu, Nov 15, 2012 at 2:03 PM, Sitaram Chamarty sitar...@gmail.com wrote:

Hi,

It seems to me that whatever namespaces can do, can functionally be done using just a subdirectory of branches. The only real differences I can see are (a) a client sees less branch clutter, and (b) a fetch/clone pulls down less if the big stuff is in another namespace. I would like to understand what other uses/reasons were thought of. (I should mention that I am asking from the client perspective. I know that on the server this has potential to save a lot of disk space).

I looked for discussion on the ml archives. I found the patch series but could not easily find much *discussion* of the feature and its design. I found one post [1] that indicated that part of the rationale... (being what I described above), but I would like to understand the *rest* of the rationale. Pointers to gmane are also fine, or brief descriptions of uses [being] made of this.

[1]: http://article.gmane.org/gmane.comp.version-control.git/175832/match=namespace

Thanks

--
Sitaram
Re: Local clones aka forks disk size optimization
On Thu, Nov 15, 2012 at 7:04 AM, Andrew Ardill andrew.ard...@gmail.com wrote:

On 15 November 2012 12:15, Javier Domingo javier...@gmail.com wrote:

Hi Andrew,

Doing this would require that I track which one comes from which. So it would imply some logic (and db) over it. With the hardlinking way, it wouldn't require anything. The idea is that you don't have to do anything else in the server. I understand that it would be impossible to do it for windows users (but using cygwin), but for *nix ones yes...

Javier Domingo

Paraphrasing from git-clone(1): When cloning a repository, if the source repository is specified with /path/to/repo syntax, the default is to clone the repository by making a copy of HEAD and everything under objects and refs directories. The files under .git/objects/ directory are hardlinked to save space when possible. To force copying instead of hardlinking (which may be desirable if you are trying to make a back-up of your repository) --no-hardlinks can be used. So hardlinks should be used where possible, and if they are not try upgrading Git. I think that covers all the use cases you have?

I am not sure it does. My understanding is this: 'git clone -l' saves space on the initial clone, but subsequent pushes end up with the same objects duplicated across all the forks (assuming most of the forks keep up with some canonical repo). The alternates mechanism can give you ongoing savings (as long as you push to the main repo first), but it is "dangerous", in the words of the git-clone manpage. You have to be confident no one will delete a ref from the main repo and then do a gc or let it auto-gc. He's looking for something that addresses both these issues.

As an additional idea, I suspect this is what the namespaces feature was created for, but I am not sure, and have never played with it till now. Maybe someone who knows namespaces very well will chip in...
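The hardlinking behaviour quoted from git-clone(1) can be observed directly: after a local path clone, a loose object file in the source shares its inode with the copy in the clone. A sketch assuming a Linux filesystem with hard links and GNU stat:

```shell
# Show that a local path clone hardlinks object files rather than
# copying them: the source's loose objects end up with link count 2.
set -e
git init -q src && cd src
git -c user.name=d -c user.email=d@example.com \
    commit -q --allow-empty -m one
cd ..
git clone -q src dst
obj=$(find src/.git/objects -type f | head -n 1)
stat -c '%h' "$obj"    # link count 2: one name in src, one in dst
```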
Re: git-clone and unreliable links?
On Wed, Nov 7, 2012 at 9:24 PM, Shawn Pearce spea...@spearce.org wrote:

On Wed, Nov 7, 2012 at 7:35 AM, Josef Wolf j...@raven.inka.de wrote:

When using git-clone over an unreliable link (say, UMTS) and the network goes down, git-clone deletes everything that was downloaded. When the network goes up again and you restart git-clone, it has to start over from the beginning. Then, eventually, the network goes down again, and everything is deleted again. Is there a way to omit the deleting step, so the second invocation would start where the first invocation was interrupted?

No, because a clone is not resumable. The best way to obtain a repository over an unstable link is to ask the repository owner to make a bundle file with `git bundle create --heads --tags` and serve the file using standard HTTP or rsync, which are resumable protocols. After you download the file, you can clone or fetch from the bundle to initialize your local repository, and then run git fetch to incrementally update to anything that is more recent than the bundle's creation.

If the server is running gitolite, the admin can set it up so that a bundle file is automatically created as needed (including "don't do it more than once per duration" logic), and serve it up over rsync using the same ssh credentials as for access to the repo itself. However, this is not particularly useful for systems with git://, although it could certainly be *adapted* for http access. [Documentation is inline, in src/commands/rsync, for people who wish to know.]

--
Sitaram
Re: In search of a version control system
On Sun, Oct 21, 2012 at 5:50 PM, Drew Northup n1xim.em...@gmail.com wrote:

On Tue, Oct 9, 2012 at 1:58 AM, Matthieu Moy matthieu@grenoble-inp.fr wrote:

David Aguilar dav...@gmail.com writes:

I would advise against the file locking, though. You ain't gonna need it ;-)

What do you suggest to merge Word files?

If the files are in the DOCX format you can just expand them as zip archives and diff what's inside of them. The text in particular is stored as XML.

You also need a merge driver that at least splits the "all in one single very very long line" XML into different lines in some way. I don't think git can merge even text files if everything is on one line in each file. And even if you do this I don't think the result will be a valid ODT etc file.

All in all, I prefer the locking that David mentioned [1]. And if your users cannot be trained to check first (as that URL describes), then you probably have to use a CVCS that supports some stronger form of locking.

[1]: http://sitaramc.github.com/gitolite/locking.html
Re: feature request
On Tue, Oct 16, 2012 at 10:57 PM, Angelo Borsotti angelo.borso...@gmail.com wrote:

Hi Andrew,

one nice thing is to warn a developer that wants to modify a source file, that there is somebody else changing it beforehand. It is nicer than discovering that at push time.

Andrew: also see http://sitaramc.github.com/gitolite/locking.html for a way to do file locking (and enforce it) using gitolite. This does warn, as long as the user remembers to try to acquire a lock before working on a binary file. (You can't get around that requirement on a DVCS, sorry!)

Take into account that there are changes in files that may be incompatible to each other, or that can be amenable to be automatically merged producing wrong results. So, knowing it could help.

-Angelo

--
Sitaram
Re: Why git shows staging area to users?
On Sun, Oct 14, 2012 at 2:38 AM, Yi, EungJun semtlen...@gmail.com wrote:

Hi, all. Why does git show the staging area to users, when other scms hide it? What benefits do users get? I feel the staging area is useful, but it is difficult to explain why when someone asks me about that.

I wrote this a long time ago, more for my understanding than otherwise, but maybe it is useful: http://sitaramc.github.com/concepts/uses-of-index.html
Re: A basic question
On Thu, Oct 11, 2012 at 11:08 PM, Jim Vahl j...@wmdb.com wrote:

Drew,

Thanks for responding to my email! Yes, I did read most of the Book, although I admit that I skimmed over some of the more technical parts. There is still a key part of how git is used in a commercial environment which I don't understand. When we release a new version of our product, it comprises over a hundred files. Some of these files have not changed for years, and some have been revised/fixed/updated quite recently. But what is key is that all of these components have passed a review and testing process. A very important piece of information is what revision of each file made it into the release.

I'm afraid I don't have anything to add to what was already said but I can't resist asking: are you coming from clearcase?
Fwd: potential path traversal issue in v3 with wild repos
oops; forgot to add the git list earlier.

-- Forwarded message --
From: Sitaram Chamarty sitar...@gmail.com
Date: Wed, Oct 10, 2012 at 5:15 AM
Subject: potential path traversal issue in v3 with wild repos
To: gitolite gitol...@googlegroups.com, gitolite-annou...@googlegroups.com
Cc: Stephane Chazelas stephane.chaze...@gmail.com

Hello all,

I'm sorry to say there is a potential path traversal vulnerability in v3. Thanks to Stephane (copied) for finding it and alerting me.

Can it affect you? This can only affect you if you are using wild card repos, *and* at least one of your patterns allows the string ../ to match multiple times. (As far as I can tell, this does not affect v2).

How badly can it affect you? A malicious user who *also* has the ability to create arbitrary files in, say, /tmp (e.g., he has his own userid on the same box), can compromise the entire git user. Otherwise the worst he can do is create arbitrary repos in /tmp.

What's the fix? The fix has been pushed, and tagged v3.1. This patch [1] is also backportable to all v3.x tags, in case you are not ready for an actual upgrade (it will have a slight conflict with v3.0 and v3.01 but you should be able to resolve it easily enough; with the others there is no conflict.)

My sincere apologies for the fsck-up :(

[1]: it's not the top-most commit; be careful if you want to 'git cherry-pick' this to an older tree.

--
Sitaram
Re: git clone algorithm
On Tue, Oct 9, 2012 at 10:53 PM, Bogdan Cristea crist...@gmail.com wrote:
> I have already posted this message on git-us...@googlegroups.com but I
> have been advised to rather use this list. I know that there is a related
> thread (http://thread.gmane.org/gmane.comp.version-control.git/207257),
> but I don't think that this provides an answer to my question (me too, I
> am on a slow 3G connection :))
>
> I am wondering what algorithm is used by the git clone command. When
> cloning from remote repositories, if there is a link failure and the same
> command is issued again, the process should be smart enough to figure out
> what objects have already been transferred locally and restart the
> cloning process from the point where it was interrupted. As far as I can
> tell this is not the case; each time I have restarted the cloning
> process, everything started from the beginning. This is extremely
> annoying with slow, unreliable connections. Are there any ways to cope
> with this situation, or any future plans?

This is not an answer to your question in the general case, sorry...

Admins who are managing a site using gitolite can set it up to
automatically create and maintain bundle files, and allow them to be
downloaded using rsync (which, as everyone knows, is resumable), using the
same authentication and access rules as gitolite itself. Once you add a
couple of lines to the gitolite.conf, it's all pretty much
self-maintaining.
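[Editor's note: for a server not running gitolite, the same resumable
effect can be had by hand with 'git bundle' plus rsync; the host name and
paths below are made up for illustration:]

```shell
# On the server: keep an up-to-date bundle of the repository.
git -C /srv/git/project.git bundle create /srv/bundles/project.bundle --all

# On the client, over the flaky link -- rsync resumes partial transfers:
rsync --partial --progress user@server:/srv/bundles/project.bundle .

# A bundle file is a valid clone source:
git clone project.bundle project
cd project

# Point origin at the real repository and catch up on anything newer:
git remote set-url origin user@server:/srv/git/project.git
git fetch origin
```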
Re: Ignore on commit
On Fri, Oct 5, 2012 at 7:05 AM, demerphq demer...@gmail.com wrote:
> On 5 October 2012 03:00, Andrew Ardill andrew.ard...@gmail.com wrote:
>> On 5 October 2012 07:20, Marco Craveiro marco.crave...@gmail.com wrote:
>>> ... Similar but not quite; the idea is that you know that there is
>>> some code (I'm just talking about files here, so let's ignore hunks
>>> for the moment) which is normally checked in, but for a period of time
>>> you want it ignored. So you don't want it git-ignored, but at the same
>>> time you don't want to see these files in the list of modified files.
>>
>> What is the reason git ignore is no good in this case? Is it simply
>> that you can't see the ignored files in git status, or is it that
>> adding and removing entries to .gitignore is too cumbersome? If it's
>> the latter you could probably put together a simple shell wrapper to
>> automate the task, as otherwise it seems like git ignore does what you
>> need.
>
> Git ignore doesn't ignore tracked files.

Would 'git update-index --assume-unchanged' work in this case? I didn't
see it mentioned in any of the replies so far (but I have never used it
myself).
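[Editor's note: spelling that suggestion out; the file name is just an
example:]

```shell
# Temporarily hide local modifications to a tracked file from "git status":
git update-index --assume-unchanged settings.ini

# ... work as usual; the file no longer shows up as modified ...

# When you want git to notice changes to it again:
git update-index --no-assume-unchanged settings.ini

# List files currently marked assume-unchanged (lowercase 'h' in column 1):
git ls-files -v | grep '^h'
```

Note this is a performance hint, not a guarantee; some operations (e.g. a
checkout that touches the file) can still clobber or reveal the change.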
Re: How to update a cloned git repository
On Tue, Sep 11, 2012 at 4:47 PM, Joachim Schmitz j...@schmitz-digital.de wrote:
> Like this? git pull --rebase HEAD~42
>
> So far I create patches, wipe out the entire repository, clone, fork and
> apply the changes; pretty painful.

I think a 'git pull --rebase' should usually work, even for 'pu'. But
sometimes pu may have changes that take away the basis for your patch
(i.e., not just a restructure); then you'd get conflicts.
Re: Clone to an SSH destination
On Mon, Sep 3, 2012 at 6:45 PM, Mark Hills mark.hi...@framestore.com wrote:
> On Mon, 3 Sep 2012, Sitaram Chamarty wrote:
>> On Mon, Sep 3, 2012 at 5:17 PM, Konstantin Khomoutov
>> flatw...@users.sourceforge.net wrote:
>>> On Mon, 3 Sep 2012 11:21:43 +0100 (BST) Mark Hills
>>> mark.hi...@framestore.com wrote:
>>> [snip]
>>>> This is quite cumbersome; we have a large team of devs who use a
>>>> simple 'git clone' to an NFS directory, but we wish to retire NFS
>>>> access.
>>> [snip]
>>> gitolite kind of implements this (wild repos) [1], you could look if
>>> it suits your needs.
>>
>> The simplest conf to do what you want in gitolite is something like
>> this:
>>
>>     repo [a-zA-Z0-9]..*
>>         C   =   @all
>>         RW+ =   @all
>>
>> But of course your *user* authentication will probably change quite a
>> bit, since gitolite runs as one Unix user and merely simulates many
>> gitolite users, while in the NFS method each of your devs probably has
>> a full login to the server.
>
> I'll check out gitolite, thanks.
>
> We use unix users extensively (groups, permissions etc.) with YP, and
> this works well; a separate permissions scheme is not very desirable.
>
> The ssh method works very well right now, and is nicely transparent.
> It's only the initial clone/creation that is harder than it was over
> NFS. And it prevents the use of git-shell too.

If I had to do this, and didn't want to use gitolite or something like it,
I'd just make a script that will create the repo using an ssh call and
then do a 'git push --mirror' to it. Call it git-new or something, and
train people to use that instead of clone when the repo doesn't even exist
yet. Bound to be easier than the administrative hassle you spoke of in
your other email...
Re: Clone to an SSH destination
On Mon, Sep 3, 2012 at 7:38 PM, Konstantin Khomoutov
flatw...@users.sourceforge.net wrote:
> On Mon, 3 Sep 2012 14:07:48 +0100 (BST) Mark Hills
> mark.hi...@framestore.com wrote:
> [...]
>>> But I'm actually more curious about why you need this in the first
>>> place; there's a bunch of devs where I work as well, but they never
>>> have the need to create new repos on some NFS drive in this manner.
>>
>> Without a command-line onto the filesystem (either local or NFS), how
>> do you create a new repository for a new project?
>>
>> We have a fairly large team on a diverse set of projects. Projects come
>> and go, so it's a burden if an administrator is needed just to create
>> repos. Likewise, it's a step backwards for the developer to need to log
>> in themselves over SSH -- whereas 'git clone' is so easy to NFS.
>>
>>> What are your devs doing when they do clone their current working
>>> directory to some NFS location? Maybe there's a better way to do it.
>>
>> Most projects start as a small test at some point; eg.
>>
>>     mkdir xx
>>     cd xx
>>     git init
>>     (write some code)
>>     git commit
>>     ...
>>
>> When a project becomes more official, the developer clones to a central
>> location; eg.
>>
>>     git clone --bare . /net/git/xx.git
>>
>> This is the step that is inconvenient if only SSH access is available.
>
> Well, then it looks like you want something like GitHub. In this case
> look at some more integrated solution such as Gitlab [1] -- I did not
> try it, but it looks like you import your users there and then they can
> log in, add their SSH keys and create their projects.

Anything web based would be even more overhead than a simple:

    ssh server git init --bare foo/bar.git
    git push --mirror ssh://git/~/foo/bar.git

Gitolite of course is even closer, as we discussed earlier.

> I also think gitolite has some way to actually use regular SSH users (or
> even users coming from a web server which is a front-end for Smart HTTP
> Git transport, doing its own authentication).
> This is explained in [2], and I hope Sitaram could provide more insight
> on setting things up this way, if needed (I did not use this feature).

As I said earlier, regardless of how he does it, authentication will
change, since he is no longer using a local (well, locally mounted) file
system as the server. That may be "get everyone to send us a pub key", or
"give everyone an http password and use smart http".

In addition, if they choose smart http, they *have to* use gitolite.
Unlike ssh, where that two-command sequence above would do it all for
them, there is no equivalent if your git server is behind http.

> 1. http://gitlabhq.com/
> 2. http://sitaramc.github.com/gitolite/g2/auth.html

--
Sitaram
Re: receive.denyNonFastForwards not denying force update
On Tue, Aug 21, 2012 at 6:52 AM, Junio C Hamano gits...@pobox.com wrote:
> Sitaram Chamarty sitar...@gmail.com writes:
>> On Mon, Aug 20, 2012 at 10:35 PM, Junio C Hamano gits...@pobox.com wrote:
>>> John Arthorne arthorne.ecli...@gmail.com writes:
>>>> For all the details see this bugzilla, particularly comment #59 where
>>>> we finally narrowed this down:
>>>> https://bugs.eclipse.org/bugs/show_bug.cgi?id=343150
>>>
>>> What does "at the system level" in your "does *not* work at the system
>>> level" exactly mean?
>>
>> "git config --system receive.denynonfastforwards true" is not honored.
>> At all. (And I checked there was nothing overriding it.) --global does
>> work (is honored). Tested on 1.7.11.
>
> Thanks, and interesting.

Uggh. My fault, this one. I had a very tight umask on root, and running
'git config --system' created an /etc/gitconfig that was not readable by
a normal user. Running strace clued me in...

John: maybe it's as simple as that in your case too.

Junio/Brandon/Jeff: sorry for the false corroboration of John's report!
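[Editor's note: the symptom is easy to check for; a quick sanity test,
using the paths from the thread:]

```shell
# If a system-level setting seems to be silently ignored for normal
# users, check whether /etc/gitconfig is actually world-readable:
ls -l /etc/gitconfig        # want at least: -rw-r--r--

# A root umask of 077 at the time of 'git config --system' yields
# mode 600; the typical fix:
sudo chmod 644 /etc/gitconfig
```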
Re: need help with syncing two bare repos
On Fri, Aug 3, 2012 at 11:59 PM, Eugene Sajine eugu...@gmail.com wrote:
> Hi,
>
> Could somebody please advise about how to address the following:
>
> I have a bare repo (bareA) on one server in network1 and I have a mirror
> of it on another server (bareB) in network2. BareB is updated
> periodically - no problem here. If bareA dies, users are supposed to
> move seamlessly to bareB. When bareA goes back up, users are moved back,
> but before it starts serving repos (before git-daemon starts) it updates
> from bareB.
>
> Now the problem I have is if bareA doesn't actually die, but the
> connection between the two networks drops. In this case users from
> network2 will stop seeing bareA and will start working with bareB, while
> users in network1 will continue to work with bareA. What would be the
> best way of syncing bareB back to bareA when the connection is restored?
>
> I think the best variant would be to do something like:
>
>     $ git pull --rebase /refs/heads/*:/refs/heads/*
>     $ git push origin /refs/heads/*:/refs/heads/*
>
> but pull will not work on bare repos as I understand, and there might be
> conflicts that will lead to an unknown (for me) state of the bare repos.
> Maybe I'm looking in the wrong direction? Maybe a simple two-way rsync
> will do the job? But I'm a bit reluctant to rely on rsync because I'm
> afraid it may screw up the repository information.

The way I solve this is to insist that I *manually* specify which is the
push destination, and not make it automated. The other one is a read-only
mirror until something is manually switched.

Might not be ideal for every situation; your call...
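[Editor's note: if you do need to reconcile after such a split, a safe
first step is to fetch bareB's branches into a separate namespace so
nothing on bareA is overwritten, then compare; the remote name "mirror"
is an assumption:]

```shell
# In bareA -- fetch works fine in a bare repository; "pull" does not,
# since there is no work tree to merge into:
git fetch mirror '+refs/heads/*:refs/remotes/mirror/*'

# Show how far each branch has diverged (commits only-here vs only-there):
for b in $(git for-each-ref --format='%(refname:short)' refs/heads); do
    echo "$b: $(git rev-list --left-right --count \
                "$b...refs/remotes/mirror/$b" 2>/dev/null || echo 'no match')"
done
```

Only after inspecting each branch would you decide what to fast-forward,
merge (in a clone with a work tree), or force.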
Re: GIT smart http vs GIT over ssh
On Tue, Jul 31, 2012 at 2:50 PM, Michael J Gruber
g...@drmicha.warpmail.net wrote:
> vishwajeet singh venit, vidit, dixit 31.07.2012 11:04:
>> On Tue, Jul 31, 2012 at 2:17 PM, Michael J Gruber
>> g...@drmicha.warpmail.net wrote:
>>> vishwajeet singh venit, vidit, dixit 31.07.2012 05:19:
>>>> On Tue, Jul 31, 2012 at 8:40 AM, Konstantin Khomoutov
>>>> kostix+...@007spb.ru wrote:
>>>>> On Tue, Jul 31, 2012 at 08:36:07AM +0530, vishwajeet singh wrote:
>>>>>> Just wanted to know the difference between smart http and ssh, and
>>>>>> in what scenarios we need them. I am setting up a git server; can I
>>>>>> just do with smart http support, or do I need to enable ssh support
>>>>>> to use git effectively? As I understand github provides both
>>>>>> protocols; what's the reason for supporting both?
>>>>>
>>>>> http://git-scm.com/book/en/Git-on-the-Server-The-Protocols
>>>>> http://git-scm.com/2010/03/04/smart-http.html
>>>>
>>>> Thanks for the links; I have already gone through them. I was looking
>>>> at it from an implementation perspective: do I really need to support
>>>> both protocols on my server, or can I just do with smart http? And
>>>> what's the preferred way of doing it, smart http or ssh?
>>>
>>> You need to provide what your users demand ;)
>>>
>>> Seriously, this is why GitHub and other providers offer both. Not only
>>> are some users more comfortable with one protocol or the other (Win
>>> users don't prefer ssh, generally), but some might be able to use only
>>> one because of firewalls or corporate rules.
>>>
>>> From the server perspective, the setup is completely different, of
>>> course. Do you have shell accounts already which you want to reuse for
>>> ssh+git? Do you prefer setting up a special-purpose shell account
>>> (gitosis/gitolite) or setting up a web server with authentication?
>>
>> I already have a server set up with a smart http backend; I was just
>> wondering if my users would really need ssh support or not, and I agree
>> with your point that it should be based on user demand.
>> While going through the git book I encountered a very tall claim about
>> smart http:
>>
>>> I think this will become the standard Git protocol in the very near
>>> future. I believe this because it's both efficient and can be run
>>> either secure and authenticated (https) or open and unauthenticated
>>> (http). It also has the huge advantage that most firewalls have those
>>> ports (80 and 443) open already and normal users don't have to deal
>>> with ssh-keygen and the like. Once most clients have updated to at
>>> least v1.6.6, http will have a big place in the Git world.
>>> http://git-scm.com/2010/03/04/smart-http.html
>>
>> Just based on the above comment in the book, I thought: if smart http
>> is the way to go for the future, why take the hassle of setting up ssh?
>
> There is no need to set up ssh if smart http does the job for you. I
> don't think it makes a difference performance-wise on the server
> (upload-pack vs. http-backend), but others are more proficient in this
> area. I'm sure ssh+git is here to stay; it is just the ordinary
> anonymous git protocol tunneled through ssh. So it's as future-proof as
> git is.
>
>> I was planning to use gitosis as I have a python background, but the
>> code looks like it has not been maintained for quite some time, which
>> worries me a bit. I stumbled upon gitomatic
>> https://github.com/jorgeecardona/gitomatic -- has anyone any prior
>> experience?
>
> No idea about gitomatic. It looks a bit like gitolite in python (alpha)
> but doesn't say much about its ancestry.
>
> There's also gitolite, which is actively maintained and used. Basically,
> it's gitosis in perl. Sitaram, forgive me ;)

Oh, that's quite alright. People forget that gitolite was, for the first
3 days of its life, called gitosis-lite :)

--
Sitaram
Re: Enhancements to git-protocol
On Mon, Jul 30, 2012 at 11:58 AM, Junio C Hamano gits...@pobox.com wrote:
> Heh. While I do not particularly consider auto-creation-upon-push a
> useful thing to begin with (after all, once you created a repository,
> you would want ways to manage it, setting up ACL for it

[side point] these things are managed with templates of ACL rules that
map access to roles, and the owner (the guy who created it) defines users
for each role.
Re: Enhancements to git-protocol
On Mon, Jul 30, 2012 at 11:58 AM, Junio C Hamano gits...@pobox.com wrote:
> Shawn Pearce spea...@spearce.org writes:
>
>>> The way to expose the extra information parsed by Git to the server
>>> side could be made into calling out to hooks, and at that point,
>>> gitolite would not even have to know about the pack protocol.
>>
>> Good point. The case that spawned this thread however still has a
>> problem with this approach. gitolite would need to create a repository
>> to invoke the receive-pack process within, and install that new hook
>> script into... when the hook was trying to prevent the creation of that
>> repository in the first place.
>
> Heh. While I do not particularly consider auto-creation-upon-push a
> useful thing to begin with (after all, once you created a repository,
> you would want ways to manage it, setting up ACL for it and stuff like
> that, so adding a "create" command to the management interface suite
> would be a more natural direction to go), as long as we are discussing a
> hack that involves hooks, I do not think your observation is a
> show-stopper downside.
>
> The hook can interact with the end user over the back channel and decide
> to abort the transaction, while leaving some clue in the repository that
> is pre-agreed between the gitolite server and the hook. When gitolite
> culls the process with wait4(2), it could notice that clue, read the
> wish of the hook that the repository needs to be removed, and remove the
> repository. Up to that point, there is no real data transferred, so
> there isn't much wasted time or network resource anyway.
>
>> An ancient Git would abort hard if passed this flag. An updated Git
>> could set environment variables before calling hooks, making the
>> arguments visible that way. And gitolite can still scrape what it needs
>> from the command line without having to muck about inside of the
>> protocol, but only if it needs to observe this new data from pusher to
>> pushee?
> I do not think the details of how the extra information is passed via
> the Git at the receiving end to its surroundings matters that much. It
> would even work fine if we let the hook talk with the end user sitting
> at the "git push" end, by using two extra sidebands to throw bits
> between them, while the Git process that spawned the hook acts as a
> relay, to establish a custom bi-di conversation (but again, I do not
> think it is useful, because such an out-of-band conversation cannot
> affect the outcome of the main protocol exchange in a meaningful way
> other than aborting).
>
> Or you could export environment variables, which would be far more
> limiting with respect to the nature of the data (i.e. needs to be free
> of NUL) and the size of data you can pass. The limitation may actually
> be a feature to discourage people from doing wacky things, though.
>
>> `git push -Rfoo=baz host:dest.git master` on the client would turn into
>> `git-receive-pack -Rfoo=baz dest.git` in the SSH and git:// command
>> line, and cause GIT_PUSH_ARG_FOO=baz to appear in the environment of
>> hooks. Over smart HTTP, requests would get an additional query
>> parameter of foo=baz.
>>
>> I think using the same extra args on the command line would be a good
>> way to upgrade the protocol version in a way the current capability
>> system does not allow us to (namely, stop the party that accepts the
>> connection from immediately advertising its refs).

More importantly, from gitolite's point of view, this is the only way
gitolite can see those variables in some situations, because gitolite
runs *before* git (and then again later, via the update hook, for
pushes).

>> The other hacky idea I had was to use a fake reference and have the
>> client push a structured blob to that ref. The server would decode the
>> blob, and deny the creation of the fake reference, but be able to get
>> additional data from that blob. It's hacky, and I don't like making a
>> new blob on the server just to transport a few small bits of data from
>> the client.
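[Editor's note: to make the proposal concrete, here is what a server-side
hook consuming those variables might look like. The -R flag and the
GIT_PUSH_ARG_* variables are a proposal in this thread, not an existing
git feature:]

```shell
#!/bin/sh
# Sketch of a pre-receive hook reading the hypothetical GIT_PUSH_ARG_*
# environment variables that `git push -Rfoo=baz ...` would set.
if [ -n "$GIT_PUSH_ARG_FOO" ]; then
    echo "client sent foo=$GIT_PUSH_ARG_FOO" >&2
fi
```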
> That way lies madness, and at that point, you are better off doing a
> proper protocol extension by registering a new capability and defining
> the semantics for it.

--
Sitaram
Re: Enhancements to git-protocol
On Sun, Jul 29, 2012 at 3:11 AM, Fredrik Gustafsson iv...@iveqy.com wrote:
> Hi,
>
> sometimes git communicates with something that's not git on the other
> side (gitolite and github, for example). Sometimes the server wants to
> communicate directly with the git user. git isn't really designed for
> this. gitolite solves this by doing user interaction on STDERR instead.
> The bad thing about this is that it can only be one-direction
> communication, for example error messages.
>
> If git would allow the user to interact directly with the server, a lot
> of cool and user-friendly features could be developed. For example:
> gitolite has something called wild repos[1]. The management is
> cumbersome, and if you misspell when you clone a repo you might instead
> create a new repo.

For the record, although it cannot do the yes/no part, if you want to
disable auto-creation on a fetch/clone (read operation), it's trivial to
add a PRE_CREATE trigger to do that.
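[Editor's note: a minimal sketch of such a trigger. The argument order
(trigger name, repo, user, access) is an assumption -- check the triggers
documentation for your gitolite version before using anything like this:]

```shell
#!/bin/sh
# Hypothetical PRE_CREATE trigger: refuse to auto-create a wild repo on a
# read (clone/fetch); only a push may create it.
#
# Enabled from ~/.gitolite.rc with something like:
#     PRE_CREATE => [ 'no-create-on-read' ],
repo=$2 user=$3 access=$4
case "$access" in
R*) echo "FATAL: will not auto-create '$repo' on a read; push to create" >&2
    exit 1 ;;
esac
exit 0
```

With this in place, a mistyped repo name in a clone fails loudly instead
of silently creating a new empty repository.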