Symlinks [was Re: Storing permissions]
There's one more mode bit we might actually care about: the symlink bit. (One would store the target as the blob, presumably, but chmod isn't going to create symlinks out of regular files.) Morten - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
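A minimal sketch of the "store the target as the blob" idea, assuming a symlink is recorded as a (mode, blob) pair whose blob is just the link target text. The helper names snapshot_entry and restore_entry are made up for illustration and are not git functions.

```python
import os
import stat


def snapshot_entry(path):
    """Return (mode, blob) for a regular file or a symlink, without following links."""
    st = os.lstat(path)  # lstat looks at the link itself, not its target
    if stat.S_ISLNK(st.st_mode):
        # The blob for a symlink is simply the target path as bytes.
        return stat.S_IFLNK, os.fsencode(os.readlink(path))
    with open(path, "rb") as f:
        return stat.S_IFREG | stat.S_IMODE(st.st_mode), f.read()


def restore_entry(path, mode, blob):
    """Recreate a file or symlink from a (mode, blob) pair."""
    if stat.S_ISLNK(mode):
        # chmod cannot turn a regular file into a link; it has to be created as one.
        os.symlink(os.fsdecode(blob), path)
    else:
        with open(path, "wb") as f:
            f.write(blob)
        os.chmod(path, stat.S_IMODE(mode))
```

The point of the separate branch in restore_entry is the one made above: the symlink bit is a file type, not a permission, so it has to be handled at creation time.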
Re: Storing permissions
Linus Torvalds wrote:
> On Sun, 17 Apr 2005, David A. Wheeler wrote:
> > There's a minor reason to write out ALL the perm bit data, but
> > only care about a few bits coming back in: Some people use
> > SCM systems as a generalized backup system
>
> Yes. I was actually thinking about having system config files in a git repository when I started it, since I noticed how nicely it would do exactly that.
>
> However, since the mode bits also end up being part of the name of the tree object (ie they are most certainly part of the hash), it's really basically impossible to only care about one bit but writing out many bits: it's the same issue of having multiple "identical" blocks with different names.
> ...
> One solution is to tell git with a command line flag and/or config file entry that "for this repo, I want you to honor all bits". That should be easy enough to add at some point, and then you really get what you want.

Yes, I thought of that too. And I agree, that should do the job. My real concern is I'm looking at the early design of the storage format so that it's POSSIBLE to extend git in obvious ways. As long as it's possible later, then that's a great thing.

...

> Also, I made a design decision that git only cares about non-dotfiles. Git literally never sees or looks at _anything_ that starts with a ".". I think that's absolutely the right thing to do for an SCM (if you hide your files, I really don't think you should expect the SCM to see it), but it's obviously not the right thing for a backup thing.

Again, a command line flag or config file entry could change that in the future, if desired. So this is a decision that could be changed later... the best kind of decision :-).

--- David A. Wheeler
Re: Storing permissions
On Sun, 17 Apr 2005, David A. Wheeler wrote:
>
> There's a minor reason to write out ALL the perm bit data, but
> only care about a few bits coming back in: Some people use
> SCM systems as a generalized backup system

Yes. I was actually thinking about having system config files in a git repository when I started it, since I noticed how nicely it would do exactly that.

However, since the mode bits also end up being part of the name of the tree object (ie they are most certainly part of the hash), it's really basically impossible to only care about one bit but writing out many bits: it's the same issue of having multiple "identical" blocks with different names.

It's ok if it happens occasionally (it _will_ happen at the point of a tree conversion to the new format, for example), but it's not ok if it happens all the time - which it would, since some people have umask 002 (and individual groups) and others have umask 022 (and shared groups), and I can imagine that some anal people have umask 0077 ("I don't want to play with others").

The trees would constantly bounce between a million different combinations (since _some_ files would be checked out with the "other" mode). At least if you always honor umask or always totally ignore umask, you get a nice repeatable thing.

We tried the "always ignore" umask thing, and the problem with that is that while _git_ ended up always doing a "fchmod()" to reset the whole permission mask, anybody who created files any other way and then checked them in would end up using umask.

One solution is to tell git with a command line flag and/or config file entry that "for this repo, I want you to honor all bits". That should be easy enough to add at some point, and then you really get what you want.

That said, git won't be really good at doing system backup. I actually _do_ save a full 32-bit of "mode" (hey, you could have "immutable" bits etc set), but anybody who does anything fancy at all with mtime would be screwed, for example.
Also, right now we don't actually save any other type of file than regular/directory, so you'd have to come up with a good save-format for symlinks (easy, I guess - just make a "link" blob) and device nodes (that one probably should be saved in the "cache_entry" itself, possibly encoded where the sha1 hash normally is).

Also, I made a design decision that git only cares about non-dotfiles. Git literally never sees or looks at _anything_ that starts with a ".". I think that's absolutely the right thing to do for an SCM (if you hide your files, I really don't think you should expect the SCM to see it), but it's obviously not the right thing for a backup thing. (It _might_ be the right thing for a system config file, though, eg tracking something like "/etc" with git might be ok, modulo the other issues).

Linus
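To make the "identical blocks with different names" problem concrete, here is a rough sketch. It assumes a simplified tree serialization (octal mode, name, NUL, blob hash) that stands in for git's real format, and the function names are hypothetical. The same one-file tree is hashed as it would look after checkout under three umasks, first with the raw mode bits and then with the mode reduced to 0644/0755.

```python
import hashlib


def tree_name(entries):
    """Hash (mode, name, blob_sha_hex) entries into a tree name.

    Simplified stand-in for a tree object, not git's exact on-disk format.
    """
    h = hashlib.sha1()
    for mode, name, sha in sorted(entries, key=lambda e: e[1]):
        h.update(b"%o %s\0" % (mode, name.encode()))
        h.update(bytes.fromhex(sha))
    return h.hexdigest()


def checked_out_mode(umask, executable=False):
    """Mode a regular file ends up with when created 0666/0777 under a umask."""
    base = 0o777 if executable else 0o666
    return 0o100000 | (base & ~umask)


def normalize(mode):
    """Keep only the owner-execute bit: 100755 if set, else 100644."""
    return 0o100755 if mode & 0o100 else 0o100644


BLOB = hashlib.sha1(b"hello\n").hexdigest()
UMASKS = (0o002, 0o022, 0o077)

# Same content, three developers, three umasks.
RAW_NAMES = {tree_name([(checked_out_mode(u), "README", BLOB)]) for u in UMASKS}
NORMALIZED_NAMES = {
    tree_name([(normalize(checked_out_mode(u)), "README", BLOB)]) for u in UMASKS
}
```

With raw modes the set holds three different tree names for identical content; with normalized modes it collapses to one, which is the repeatability being asked for.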
Re: Storing permissions
On Sun, 17 Apr 2005, David A. Wheeler wrote:
> There's a minor reason to write out ALL the perm bit data, but
> only care about a few bits coming back in: Some people use
> SCM systems as a generalized backup system, so you can back up
> your system to an arbitrary known state in the past
> (e.g., "Change my /etc files to the state I was at
> just before I installed that &*#@ program!").
> For more on this, see:
> http://www.onlamp.com/pub/a/onlamp/2005/01/06/svn_homedir.html
>
> If you store all the bits, then you CAN restore things
> more exactly the way they were. This is imperfect, since
> it doesn't cover more exotic permission
> values from SELinux, xattrs, whatever. For some, that's enough.

I think this should be possible with a different tag than "tree". All the bits aren't sufficient, anyway; the unincluded values include the user and group, which are likely to matter for some things in /etc.

But there's no reason that the core can't support both a system-local complete representation of the dentry and a user-relative representation of a source distribution with different tags. For that matter, it could accept "dir" objects in commits as well, and use version-control-type logic on history while refusing to do nonsensical things with them.

-Daniel
*This .sig left intentionally blank*
Re: Storing permissions
David wrote:
> There's a minor reason to write out ALL the perm bit data, but

There's always the 'configurable option' approach.

Someone, I doubt Linus will have any interest in it, could volunteer to make the masks of st_mode, used when storing and recovering file permissions, be configurable by some environment variable settings, which default to whatever Linus provided.

But, in general, if you want a generalized backup system, git is not it. Git skips all files whose name begins with the dot '.' character, and anything that is not a regular file or directory. Git makes no concessions to working adequately on file systems lacking normal inode numbers (such as smb, fat, vfat). Git obscures the archive format a modest amount, for pure speed and to encourage use only via appropriate wrappers. Git is tuned for blazing speed at the operations that Linus needs, not for trivial recovery, using the most basic tools, under harsh circumstances.

The basic idea of using such an 'object database' (though I dislike that term -- too high falutin vague) of files stored by their hash is a good one.

But a different core implementation is needed for backups.

I have one that I use for my own backups, but it is written in Python, and uses MD5, one or the other of which likely disqualifies it from further consideration by half the readers of this list.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401
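A sketch of what the 'configurable option' might look like, purely as an illustration. The variable name GIT_MODE_MASK and both helpers are invented here, and the default of 0100 (owner-execute only) is an assumption about "whatever Linus provided".

```python
import os

# Assumed default: only the owner-execute bit matters.
DEFAULT_MODE_MASK = 0o100


def mode_mask(environ=None, var="GIT_MODE_MASK"):
    """Read an octal permission mask from the environment.

    GIT_MODE_MASK is a hypothetical name, not a variable git reads.
    """
    if environ is None:
        environ = os.environ
    raw = environ.get(var)
    if raw is None:
        return DEFAULT_MODE_MASK
    # Parse as octal and never let the mask reach beyond the permission bits.
    return int(raw, 8) & 0o7777


def stored_bits(st_mode, mask):
    """Permission bits of st_mode that would be stored (and compared)."""
    return st_mode & mask
```

Someone wanting a backup-style repository would export the variable as 0777 (or 07777) and get every permission bit honored; everyone else would see no change.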
Re: Storing permissions
Linus Torvalds wrote:
> On Sat, 16 Apr 2005, Paul Jackson wrote:
> > Morten wrote:
> > > It makes some sense in principle, but without storing what they mean
> > > (i.e., group==?) it certainly makes no sense.
> >
> > There's no "they" there.
> >
> > I think Martin's proposal, to which I agreed, was to store a _single_
> > bit. If any of the execute permissions of the incoming file are set,
> > then the bit is stored ON, else it is stored OFF. On 'checkout', if the
> > bit is ON, then the file permission is set mode 0777 (modulo umask),
> > else it is set mode 0666 (modulo umask).
>
> I think I agree. Anybody willing to send me a patch? One issue is that if done the obvious way it's an incompatible change, and old tree objects won't be valid any more. It might be ok to just change the "compare cache" check to only care about a few bits, though: S_IXUSR and S_IFDIR.

There's a minor reason to write out ALL the perm bit data, but only care about a few bits coming back in: Some people use SCM systems as a generalized backup system, so you can back up your system to an arbitrary known state in the past (e.g., "Change my /etc files to the state I was at just before I installed that &*#@ program!"). For more on this, see:
http://www.onlamp.com/pub/a/onlamp/2005/01/06/svn_homedir.html

If you store all the bits, then you CAN restore things more exactly the way they were. This is imperfect, since it doesn't cover more exotic permission values from SELinux, xattrs, whatever. For some, that's enough.

Yeah, I know, not the main purpose of git. But what the heck, I _like_ flexible infrastructures.

--- David A. Wheeler
Re: Storing permissions
On Sat, 16 Apr 2005, Linus Torvalds wrote:
>
> Anybody want to send a patch to do this?

Actually, I just did it. Seems to work for the only test-case I tried, namely I just committed it, and checked that the permissions all ended up being recorded as 0644 in the tree (if it has the -x bit set, they get recorded as 0755).

When checking out, we always check out with 0666 or 0777, and just let umask do its thing. We only test bit 0100 when checking for differences.

Maybe I missed some case, but this does indeed seem saner than the "try to restore all bits" case. If somebody sees any problems, please holler.

(Btw, you may or may not need to blow away your "index" file by just re-creating it with a "read-tree" after you've updated to this. I _tried_ to make sure that the compare just ignored the ce_mode bits, but the fact is, your index file may be "corrupt" in the sense that it has permission sets that sparse expects to never generate in an index file any more..)

Linus
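The checkout and compare behaviour described above comes down to a few lines of bit arithmetic. This is only a sketch of the arithmetic, not the patch itself, and checkout_mode and modes_differ are names picked for the example.

```python
def checkout_mode(recorded_mode, umask):
    """Permissions a file gets on checkout: 0777 or 0666, then the umask applies."""
    requested = 0o777 if recorded_mode & 0o100 else 0o666
    return requested & ~umask & 0o777


def modes_differ(a, b):
    """Only bit 0100 (owner execute) counts when checking for differences."""
    return bool((a ^ b) & 0o100)
```

So a tree entry recorded as 0755 comes out 0755 under umask 022 and 0700 under umask 077, and neither working copy shows up as modified, since modes_differ ignores everything except 0100.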
Re: Storing permissions
Linus wrote:
> It might be ok to just change the "compare cache" check to only care
> about a few bits, though: S_IXUSR and S_IFDIR.
> And then ...

I think I agree. But since I am reluctant to take enough time to understand the code well enough to write this patch, I'll shut up now ;).

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401
Re: Storing permissions
Paul Jackson wrote:
> Junio wrote:
> > Sounds like svn
>
> I have no idea what svn is.

svn = common abbreviation for "Subversion", a widely-used centralized SCM tool intentionally similar to CVS.

--- David A. Wheeler
Re: Storing permissions
On Sat, 16 Apr 2005, Paul Jackson wrote:
>
> Morten wrote:
> > It makes some sense in principle, but without storing what they mean
> > (i.e., group==?) it certainly makes no sense.
>
> There's no "they" there.
>
> I think Martin's proposal, to which I agreed, was to store a _single_
> bit. If any of the execute permissions of the incoming file are set,
> then the bit is stored ON, else it is stored OFF. On 'checkout', if the
> bit is ON, then the file permission is set mode 0777 (modulo umask),
> else it is set mode 0666 (modulo umask).

I think I agree.

Anybody willing to send me a patch? One issue is that if done the obvious way it's an incompatible change, and old tree objects won't be valid any more. It might be ok to just change the "compare cache" check to only care about a few bits, though: S_IXUSR and S_IFDIR.

And then always write new "tree" objects out with mode set to one of
 - 04: we already do this for directories
 - 100644: normal files without S_IXUSR set
 - 100755: normal files _with_ S_IXUSR set

Then, at compare time, we only look at S_IXUSR matching for files (we never compare directory modes anyway). And at file create time, we create them with 0666 and 0777 respectively, and let the user's umask sort it out (and if the user has 0100 set in his umask, he can damn well blame himself).

This would pretty much match the existing kernel tree, for example. We'd end up with some new trees there (and in git), but not a lot of incompatibility. And old trees would still work fine, they'd just get written out differently.

Anybody want to send a patch to do this?

Linus
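For anyone picking this up, the mode rule above could be sketched like so. It assumes the directory value is the usual S_IFDIR constant (octal 040000) with no permission bits, and tree_mode is a hypothetical helper name, not something in the git source.

```python
import stat


def tree_mode(st_mode):
    """Collapse an st_mode to one of the three values written into a tree."""
    if stat.S_ISDIR(st_mode):
        return stat.S_IFDIR             # directories: type only, no permission bits
    if st_mode & stat.S_IXUSR:
        return stat.S_IFREG | 0o755     # 100755: owner-execute set
    return stat.S_IFREG | 0o644         # 100644: owner-execute clear


def same_entry_mode(a, b):
    """Compare two modes the way the rule implies: after collapsing both."""
    return tree_mode(a) == tree_mode(b)
```

An old tree entry such as 100664 or 100775 still reads fine under this rule; it is simply rewritten as 100644 or 100755 the next time the tree is written out.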
Re: Storing permissions
Morten wrote:
> It makes some sense in principle, but without storing what they mean
> (i.e., group==?) it certainly makes no sense.

There's no "they" there.

I think Martin's proposal, to which I agreed, was to store a _single_ bit. If any of the execute permissions of the incoming file are set, then the bit is stored ON, else it is stored OFF. On 'checkout', if the bit is ON, then the file permission is set mode 0777 (modulo umask), else it is set mode 0666 (modulo umask).

You might disagree that this is a good idea, but it certainly does 'make sense' (as in 'is sensibly well defined').

> I suspect a non-readable file would cause a bit of a problem in the low-level
> commands.

Probably so. If someone sets their umask 0333 or less, then they are either fools or QA (software quality assurance, or test) engineers.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401
Re: Storing permissions
> Does it really make sense to store full permissions in the trees? I think
> that remembering the x-bit should be good enough for almost all purposes
> and the other permissions should be left to the local environment.

It makes some sense in principle, but without storing what they mean (i.e., group==?) it certainly makes no sense. It's a bit like unpacking a tar file.

I suspect a non-readable file would cause a bit of a problem in the low-level commands.

Morten
Re: Storing permissions
Junio wrote:
> Sounds like svn

I have no idea what svn is.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401
Re: Storing permissions
> "PJ" == Paul Jackson <[EMAIL PROTECTED]> writes:

PJ> That matches my experience - store 1 bit of mode state - executable or not.

Sounds like svn ;-).
Re: Storing permissions
Martin wrote:
> Does it really make sense to store full permissions in the trees? I think
> that remembering the x-bit should be good enough for almost all purposes
> and the other permissions should be left to the local environment.

That matches my experience - store 1 bit of mode state - executable or not.

Let local environment determine read, write and umask permissions.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401