Re: Terminology
> So what's the converse of "fetch" (to rename git-ssh-push to)?
> Maybe "ship"?

The opposite of "fetch" is "throw" or "toss".
(Just avoid tossing cookies or off.)

-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Terminology
Hi,

On Fri, 5 Aug 2005, [EMAIL PROTECTED] wrote:

> On Fri, 5 Aug 2005, Linus Torvalds wrote:
>
> > On Fri, 5 Aug 2005, Johannes Schindelin wrote:
> >
> > > - The files under $GIT_DIR/refs record object names, and are
> > >   called "refs". What is under refs/heads/ are called "heads",
> > >   refs/tags/ "tags". Typically, they are either object names
> > >   of commit objects or tag objects that resolve to commit
> > >   objects, but a tag can point at any object.
> > >
> > >   The tutorial never calls them "refs", but instead "references".
> >
> > It might be worth saying explicitly that a reference is nothing but the
> > same thing as an "object name" aka "sha1".
>
> Well, it's an object name stored in a file. This adds a layer of
> indirection and a meaningful name.

Yes.

> > So I'd vote for making the suggested definition official: "fetch" means
> > fetching the data, and "pull" means "fetch + merge".
>
> So what's the converse of "fetch" (to rename git-ssh-push to)?
> Maybe "ship"?

I actually like "push". You know, not everybody agrees that "push" is the
opposite of "pull"...

Ciao,
Dscho
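
The "fetch", "pull" ("fetch + merge") and "push" definitions discussed in this thread can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the `Repo` class, the `store`, `fetch`, `pull` and `push` names, and the way object names are derived are hypothetical and are not how git itself is implemented. In particular, the "merge" step simply records a commit with two parents and does not combine any content.

```python
import hashlib


class Repo:
    """Toy repository (hypothetical model, not git's data structures).

    commits maps an object name to the list of its parent names;
    refs maps a ref name to an object name.
    """

    def __init__(self):
        self.commits = {}
        self.refs = {}

    def store(self, ref, parents, label):
        # Made-up naming scheme: hash the parent names plus a label.
        text = "|".join(parents) + ":" + label
        name = hashlib.sha1(text.encode()).hexdigest()
        self.commits[name] = list(parents)
        self.refs[ref] = name
        return name

    def commit(self, ref, label):
        # A new commit on top of whatever the ref currently names.
        parents = [self.refs[ref]] if ref in self.refs else []
        return self.store(ref, parents, label)


def reachable(repo, name):
    """Return the set of object names reachable from name."""
    seen, todo = set(), [name]
    while todo:
        current = todo.pop()
        if current not in seen:
            seen.add(current)
            todo.extend(repo.commits[current])
    return seen


def fetch(local, remote, ref):
    """Copy the objects reachable from the remote ref; leave local refs alone."""
    head = remote.refs[ref]
    for name in reachable(remote, head):
        local.commits[name] = list(remote.commits[name])
    return head


def pull(local, remote, ref):
    """fetch + merge: record a commit whose parents are both heads."""
    theirs = fetch(local, remote, ref)
    ours = local.refs[ref]
    return local.store(ref, [ours, theirs], "merge")


def push(local, remote, ref):
    """Converse of fetch: populate the remote's objects, then update its ref."""
    head = local.refs[ref]
    for name in reachable(local, head):
        remote.commits[name] = list(local.commits[name])
    remote.refs[ref] = head
    return head
```

In this sketch, `fetch` only transfers objects and reports the remote head, while `pull` additionally changes the local ref; `push` is the only operation that writes to the other repository.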
Re: Terminology
Hi,

wow! What a long mail! But I probably deserved it, quoting that lengthy
mail from Junio...

On Fri, 5 Aug 2005, Linus Torvalds wrote:

> On Fri, 5 Aug 2005, Johannes Schindelin wrote:
> >
> > Tutorial says "cache" aka "index". Though technically, a cache
> > is the index file _plus_ the related objects in the object database.
> > git-update-cache.txt even makes the difference between the "index"
> > and the "directory cache".
>
> I think we should globally rename it to "index".

Totally agree. The index is a central concept. But let's keep in mind --
and make future Documentation/ readers do the same -- that the index,
without the referenced objects in the objects database, is only a
skeleton.

> The "directory cache" and later "cache" naming came from when I started
> doing the work - before git was even git at all, and had no backing store
> what-so-ever, I started out writing "cache.h" and "read-cache.c", and it
> was really first a trial at doing a totally SCM-neutral directory cache
> front-end.
>
> You don't even see that in the git revision history, because that was
> before git was self-hosting - the project was partly started to also work
> as possibly just a fast front-end to something that wasn't as fast (ie
> think something like a front-end to make "monotone" work better).
>
> So the "directory cache" and "cache" naming comes from that historical
> background: it was really started as a front-end cache, and in fact the
> ".git" directory was called ".dircache" initially. You can see some of
> that in the very earliest git releases: by then I had already done the
> backing store, and the thing was already called "git", but the "dircache"
> naming still remains in places.
>
> For example, here's my "backup" target in the initial checkin:
>
>     backup: clean
>             cd .. ; tar czvf dircache.tar.gz dir-cache
>
> which shows that not only did I call the resulting tar file "dircache",
> the directory I was developing stuff in was called "dir-cache" as well ;)
>
> The index obviously ended up doing a lot more, and especially with the
> different stages it became much more than just a directory cache thing:
> it's integral to how git does the fast part of a merge. So we should call
> it "index" and edit out the old "cache" and "directory cache" naming
> entirely.

I quoted this entirely, for a good reason: Linus, one day you really
should write a Wikibook about all the "small" projects you started. I
still remember the words "I'm doing a (free) operating system (just a
hobby, won't be big...". There's so much to be learnt about good
engineering. And people do want to add their anecdotes to it.

> > - the directory which corresponds to the top of the hierarchy
> >   described in the index file; I've seen words like "working
> >   tree", "working directory", "work tree" used.
> >
> > The tutorial initially says "working tree", but then "working
> > directory". Usually, a directory does not include its
> > subdirectories, though. git-apply-patch-script.txt, git-apply.txt,
> > git-hash-object.txt, git-read-tree.txt
> > use "work tree". git-checkout-cache.txt, git-commit-tree.txt,
> > git-diff-cache.txt, git-ls-tree.txt, git-update-cache.txt contain
> > "working directory". git-diff-files.txt talks about a "working tree".
>
> I think we should use "working tree" throughout, since "working directory"
> is unix-speak for "pwd" and has a totally different meaning.

I hoped so much.

> > - An index file can be in "merged" or "unmerged" state. The
> >   former is when it does not have anything but stage 0 entries,
> >   the latter otherwise.
>
> I think the "unmerged" case should be mentioned in the "cache entry"
> thing, since it's really a per-entry state, exactly like "dirty/clean".
>
> Then, explaining an "unmerged index" as being an index file with some
> entries being unmerged makes more sense.
>
> As it is, the above "explains" an index file as being unmerged by talking
> about "stage 0 entries", which in turn haven't been explained at all.

That's right. We probably should copy a bit from git-read-tree.txt, or at
least reference it in the glossary.

> > - A "tree object" can be recorded as a part of a "commit
> >   object". The tree object is said to be "associated with" the
> >   commit object.
> >
> > In diffcore.txt, "changeset" is used in place of "commit".
>
> We really should use "commit" throughout. ex-BK users sometimes slip into
> "changeset" (which in turn is probably because BK had these per-file
> commits too - deltas), but there's no point in the distinction in git. A
> commit is a commit.

That is, if you don't do "git-update-cache " (which is not possible with
some porcelains).

Apart from that: I think that it is quite important to make the
distinction between a "commit" and a "commit object". Newbies (in that
case, people working with CVS are newbies to the concepts of git, too)
tend to understand better what yo
Re: Terminology
On Fri, 5 Aug 2005, Linus Torvalds wrote:

> On Fri, 5 Aug 2005, Johannes Schindelin wrote:
>
> > - The files under $GIT_DIR/refs record object names, and are
> >   called "refs". What is under refs/heads/ are called "heads",
> >   refs/tags/ "tags". Typically, they are either object names
> >   of commit objects or tag objects that resolve to commit
> >   objects, but a tag can point at any object.
> >
> >   The tutorial never calls them "refs", but instead "references".
>
> It might be worth saying explicitly that a reference is nothing but the
> same thing as an "object name" aka "sha1".

Well, it's an object name stored in a file. This adds a layer of
indirection and a meaningful name.

> So I'd vote for making the suggested definition official: "fetch" means
> fetching the data, and "pull" means "fetch + merge".

So what's the converse of "fetch" (to rename git-ssh-push to)?
Maybe "ship"?

	-Daniel
*This .sig left intentionally blank*
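
The point that a ref is "an object name stored in a file" can be illustrated with a minimal Python sketch. It assumes the layout described in this thread: a plain file under $GIT_DIR/refs/ whose content is a 40-character hex SHA1. The helper name `read_ref`, the `demo` function and the object name used in the demo are hypothetical and exist only for illustration.

```python
import os
import re
import tempfile

# An object name written as 40 lowercase hex digits (a 20-byte SHA1).
HEX_SHA1 = re.compile(r"[0-9a-f]{40}")


def read_ref(git_dir, ref):
    """Return the object name stored in the file git_dir/ref.

    Hypothetical helper: assumes the ref is a plain file containing
    a hex SHA1 followed by a newline.
    """
    path = os.path.join(git_dir, *ref.split("/"))
    with open(path, "r", encoding="ascii") as handle:
        name = handle.read().strip()
    if not HEX_SHA1.fullmatch(name):
        raise ValueError(f"{ref} does not hold an object name: {name!r}")
    return name


def demo():
    """Create a throwaway directory shaped like $GIT_DIR/refs/heads/."""
    with tempfile.TemporaryDirectory() as git_dir:
        heads = os.path.join(git_dir, "refs", "heads")
        os.makedirs(heads)
        fake_name = "0123456789abcdef0123456789abcdef01234567"  # made-up value
        with open(os.path.join(heads, "master"), "w", encoding="ascii") as handle:
            handle.write(fake_name + "\n")
        # The meaningful name "refs/heads/master" resolves to the object name.
        return read_ref(git_dir, "refs/heads/master")


if __name__ == "__main__":
    print(demo())
```

The file path supplies the meaningful name and the file content supplies the object name, which is the layer of indirection mentioned above.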
Re: Terminology
On Fri, 5 Aug 2005, Johannes Schindelin wrote:
>
> Tutorial says "cache" aka "index". Though technically, a cache
> is the index file _plus_ the related objects in the object database.
> git-update-cache.txt even makes the difference between the "index"
> and the "directory cache".

I think we should globally rename it to "index".

The "directory cache" and later "cache" naming came from when I started
doing the work - before git was even git at all, and had no backing store
what-so-ever, I started out writing "cache.h" and "read-cache.c", and it
was really first a trial at doing a totally SCM-neutral directory cache
front-end.

You don't even see that in the git revision history, because that was
before git was self-hosting - the project was partly started to also work
as possibly just a fast front-end to something that wasn't as fast (ie
think something like a front-end to make "monotone" work better).

So the "directory cache" and "cache" naming comes from that historical
background: it was really started as a front-end cache, and in fact the
".git" directory was called ".dircache" initially. You can see some of
that in the very earliest git releases: by then I had already done the
backing store, and the thing was already called "git", but the "dircache"
naming still remains in places.

For example, here's my "backup" target in the initial checkin:

    backup: clean
            cd .. ; tar czvf dircache.tar.gz dir-cache

which shows that not only did I call the resulting tar file "dircache",
the directory I was developing stuff in was called "dir-cache" as well ;)

The index obviously ended up doing a lot more, and especially with the
different stages it became much more than just a directory cache thing:
it's integral to how git does the fast part of a merge. So we should call
it "index" and edit out the old "cache" and "directory cache" naming
entirely.

> - the directory which corresponds to the top of the hierarchy
>   described in the index file; I've seen words like "working
>   tree", "working directory", "work tree" used.
>
> The tutorial initially says "working tree", but then "working
> directory". Usually, a directory does not include its
> subdirectories, though. git-apply-patch-script.txt, git-apply.txt,
> git-hash-object.txt, git-read-tree.txt
> use "work tree". git-checkout-cache.txt, git-commit-tree.txt,
> git-diff-cache.txt, git-ls-tree.txt, git-update-cache.txt contain
> "working directory". git-diff-files.txt talks about a "working tree".

I think we should use "working tree" throughout, since "working directory"
is unix-speak for "pwd" and has a totally different meaning.

> - When the stat information a cache entry records matches what
>   is in the work tree, the entry is called "clean" or
>   "up-to-date". The opposite is "dirty" or "not up-to-date".
>
> - An index file can be in "merged" or "unmerged" state. The
>   former is when it does not have anything but stage 0 entries,
>   the latter otherwise.

I think the "unmerged" case should be mentioned in the "cache entry"
thing, since it's really a per-entry state, exactly like "dirty/clean".

Then, explaining an "unmerged index" as being an index file with some
entries being unmerged makes more sense.

As it is, the above "explains" an index file as being unmerged by talking
about "stage 0 entries", which in turn haven't been explained at all.

> - A "tree object" can be recorded as a part of a "commit
>   object". The tree object is said to be "associated with" the
>   commit object.
>
> In diffcore.txt, "changeset" is used in place of "commit".

We really should use "commit" throughout. ex-BK users sometimes slip into
"changeset" (which in turn is probably because BK had these per-file
commits too - deltas), but there's no point in the distinction in git. A
commit is a commit.

> - The following objects are collectively called "tree-ish": a
>   tree object, a commit object, a tag object that resolves to
>   either a commit or a tree object, and can be given to
>   commands that expect to work on a tree object.
>
> We could call this category an "ent".

LOL. You are a total geek.

> - The files under $GIT_DIR/refs record object names, and are
>   called "refs". What is under refs/heads/ are called "heads",
>   refs/tags/ "tags". Typically, they are either object names
>   of commit objects or tag objects that resolve to commit
>   objects, but a tag can point at any object.
>
> The tutorial never calls them "refs", but instead "references".

It might be worth saying explicitly that a reference is nothing but the
same thing as an "object name" aka "sha1". And make it very clear that it
can point to any object type, although commits tend to be the most common
thing you want to reference.

That then leads naturally into a very specific _subcase_ of refs, namely a
"head":

> - A "head" is always an object name of a commit, and marks the
>   latest comm
Re: Terminology
Hi,

I am finally finished with my preliminary survey: I took what you sent as
a strawman, and inserted what I found (I tried to say only something about
ambiguous naming):

 - The unit of storage in GIT is called "object"; no other word
   is used and the word "object" is used only for this purpose
   so this one is OK.

 - A 20-byte SHA1 to uniquely identify "objects"; README and
   early Linus messages call this "object name" so does
   tutorial. Many places say "object SHA1" or just "SHA1".

   "Object" is short for "immutable object". git-cat-file.txt says
   "repository object".

 - An "object database" stores a set of "objects", and an
   individual object can be retrieved by giving it its object
   name.

   Tutorial calls it an "object store". git-fsck-cache.txt names it
   "database" at first, but then also uses "object pool".

 - Storing a regular file or a symlink in the object database
   results in a "blob object" created. You cannot directly
   store filesystem directory, but a collection of blob objects
   and other tree objects can be recorded as a "tree object"
   which corresponds to this notion.

 - $GIT_INDEX_FILE is "index file", which is a collection of
   "cache entries". The former is sometimes called "cache
   file", the latter just "cache".

   Tutorial says "cache" aka "index". Though technically, a cache
   is the index file _plus_ the related objects in the object database.
   git-update-cache.txt even makes the difference between the "index"
   and the "directory cache".

 - the directory which corresponds to the top of the hierarchy
   described in the index file; I've seen words like "working
   tree", "working directory", "work tree" used.

   The tutorial initially says "working tree", but then "working
   directory". Usually, a directory does not include its
   subdirectories, though. git-apply-patch-script.txt, git-apply.txt,
   git-hash-object.txt, git-read-tree.txt
   use "work tree". git-checkout-cache.txt, git-commit-tree.txt,
   git-diff-cache.txt, git-ls-tree.txt, git-update-cache.txt contain
   "working directory". git-diff-files.txt talks about a "working tree".

 - When the stat information a cache entry records matches what
   is in the work tree, the entry is called "clean" or
   "up-to-date". The opposite is "dirty" or "not up-to-date".

 - An index file can be in "merged" or "unmerged" state. The
   former is when it does not have anything but stage 0 entries,
   the latter otherwise.

   That seems to be unambiguous (sometimes it's called "index", sometimes
   "index file"; I don't think that matters).

 - A merged index file can be written as a "tree object", which
   is technically a set of interconnected tree objects but we
   equate it with the toplevel tree object with this set.

 - A "tree object" can be recorded as a part of a "commit
   object". The tree object is said to be "associated with" the
   commit object.

   In diffcore.txt, "changeset" is used in place of "commit".

 - A "tag object" can be recorded as a pointer to another object
   of any type. The act of following the pointer a tag object
   holds (this can go recursively) until we get to a non-tag
   object is sometimes called "resolving the tag".

 - The following objects are collectively called "tree-ish": a
   tree object, a commit object, a tag object that resolves to
   either a commit or a tree object, and can be given to
   commands that expect to work on a tree object.

   We could call this category an "ent".

 - The files under $GIT_DIR/refs record object names, and are
   called "refs". What is under refs/heads/ are called "heads",
   refs/tags/ "tags". Typically, they are either object names
   of commit objects or tag objects that resolve to commit
   objects, but a tag can point at any object.

   The tutorial never calls them "refs", but instead "references".

 - A "head" is always an object name of a commit, and marks the
   latest commit in one line of development. A line of
   development is often called a "branch". We sometimes use the
   word "branch head" to stress the fact that we are talking
   about a single commit that is the latest one in a "branch".

   In the tutorial, the latter is used in reverse: it talks about a
   "HEAD development branch" and a "HEAD branch".

   I find it a little bit troublesome that $GIT_DIR/branches does not
   really refer to a branch, but rather to a (possibly remote) repository.

 - Combining the states from more than one line of development
   is called "merging" and typically done between two branch
   heads. This is called "resolving" in the tutorial and there
   is git-resolve-script command for it.

 - A set of "refs" with the set of objects reachable from them
   constitute a "repository". Although currently there is no
   provision for a repository to say that its objects are stored
   in this and that object database, multiple repositories can
   share the same object database, and there is not a conceptual
   limit that a repository must retrieve its objects from a sin
Re: Terminology
Hi,

I tried to avoid the work. But I'll do it.

Ciao,
Dscho
Re: Terminology
Johannes Schindelin <[EMAIL PROTECTED]> writes:

> Maybe we should decide on a common terminology before kicking out 1.0, and
> look through all files in Documentation/ to have a consistent vocabulary.
> And poor me does not get confused no more.

Glad to see you started the discussion on this one. I have a
slight worry and suspicion that this might open a can of worms,
but I agree we need to get this done.

We probably would end up splitting the Terminology section in
Documentation/git.txt into a separate "Glossary" document.

Care to volunteer drafting a strawman, listing the concepts we
need terms for, marking the ones we seem to use the same word
for? You do not have to suggest which candidate term to use for
all of them. Something along these lines...

 - The unit of storage in GIT is called "object"; no other word
   is used and the word "object" is used only for this purpose
   so this one is OK.

 - A 20-byte SHA1 to uniquely identify "objects"; README and
   early Linus messages call this "object name" so does
   tutorial. Many places say "object SHA1" or just "SHA1".

 - An "object database" stores a set of "objects", and an
   individual object can be retrieved by giving it its object
   name.

 - Storing a regular file or a symlink in the object database
   results in a "blob object" created. You cannot directly
   store filesystem directory, but a collection of blob objects
   and other tree objects can be recorded as a "tree object"
   which corresponds to this notion.

 - $GIT_INDEX_FILE is "index file", which is a collection of
   "cache entries". The former is sometimes called "cache
   file", the latter just "cache".

 - the directory which corresponds to the top of the hierarchy
   described in the index file; I've seen words like "working
   tree", "working directory", "work tree" used.

 - When the stat information a cache entry records matches what
   is in the work tree, the entry is called "clean" or
   "up-to-date". The opposite is "dirty" or "not up-to-date".

 - An index file can be in "merged" or "unmerged" state. The
   former is when it does not have anything but stage 0 entries,
   the latter otherwise.

 - A merged index file can be written as a "tree object", which
   is technically a set of interconnected tree objects but we
   equate it with the toplevel tree object with this set.

 - A "tree object" can be recorded as a part of a "commit
   object". The tree object is said to be "associated with" the
   commit object.

 - A "tag object" can be recorded as a pointer to another object
   of any type. The act of following the pointer a tag object
   holds (this can go recursively) until we get to a non-tag
   object is sometimes called "resolving the tag".

 - The following objects are collectively called "tree-ish": a
   tree object, a commit object, a tag object that resolves to
   either a commit or a tree object, and can be given to
   commands that expect to work on a tree object.

 - The files under $GIT_DIR/refs record object names, and are
   called "refs". What is under refs/heads/ are called "heads",
   refs/tags/ "tags". Typically, they are either object names
   of commit objects or tag objects that resolve to commit
   objects, but a tag can point at any object.

 - A "head" is always an object name of a commit, and marks the
   latest commit in one line of development. A line of
   development is often called a "branch". We sometimes use the
   word "branch head" to stress the fact that we are talking
   about a single commit that is the latest one in a "branch".

 - Combining the states from more than one line of development
   is called "merging" and typically done between two branch
   heads. This is called "resolving" in the tutorial and there
   is git-resolve-script command for it.

 - A set of "refs" with the set of objects reachable from them
   constitute a "repository". Although currently there is no
   provision for a repository to say that its objects are stored
   in this and that object database, multiple repositories can
   share the same object database, and there is not a conceptual
   limit that a repository must retrieve its objects from a single
   object database.

 - The act of finding out the object names recorded in "refs" a
   different repository records, optionally updating a local
   "refs" with their values, and retrieving the objects
   reachable from them is called "fetching". Fetching
   immediately followed by merging is called "pulling".

 - The act of updating "refs" in a different repository with new
   value and populating the object database(s) associated with
   the repository is called "pushing".

 - Currently refs/heads records branch heads of both locally
   created branches and branches fetched from other
   repositories.

 - Currently, fetching always happens against a single branch
   head on a remote repository, and (a remote repository, name
   of the branch) is stored in $GIT_DIR/branches/ as a