Re: First stab at glossary

Daniel Barkalow Wed, 17 Aug 2005 12:09:44 -0700

On Wed, 17 Aug 2005, Johannes Schindelin wrote:

> Hi,
>
> long, long time. Here's my first stab at the glossary, attached the
> alphabetically sorted, asciidoc marked up txt file (Comments?
> Suggestions? Pizzas?):
>
> object::
>       The unit of storage in GIT. It is uniquely identified by
>       the SHA1 of its contents. Consequently, an object can not
>       be changed.
>
> SHA1::
>       A 20-byte sequence (or 41-byte file containing the hex
>       representation and a newline). It is calculated from the
>       contents of an object by the Secure Hash Algorithm 1.


It's also often a 40-character hex string (with whatever termination) in
places like commit objects, tag objects, command-line arguments, listings,
and so forth.
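
To make the three forms concrete, here is a rough Python sketch. It assumes
the object name of a blob is the SHA1 of a "blob <size>" header, a NUL byte,
and then the contents; the function name blob_object_name is made up for
illustration and is not part of git.

```python
import hashlib

def blob_object_name(content: bytes) -> bytes:
    """Return the 20-byte object name of a blob (hypothetical helper).

    Assumption: the hash covers a "blob <size>\\0" header followed by
    the raw contents.
    """
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).digest()

raw_name = blob_object_name(b"hello\n")   # 20-byte binary form
hex_name = raw_name.hex()                 # 40-character hex string
ref_file = hex_name.encode() + b"\n"      # 41-byte form, as stored in a ref file

print(len(raw_name), len(hex_name), len(ref_file))
```

The 40-character form is what shows up inside commit and tag objects and on
the command line; the trailing newline only appears in the file form.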

> object database::
>       Stores a set of "objects", and an individual object is identified
>       by its SHA1 (its ref). The objects are either stored as single
>       files, or live inside of packs.
>
> object name::
>       Synonym for SHA1.

Have we killed the use of the third term "hash" for this? I'd say that
"object name" is the standard term, and "SHA1" is a nickname, if only
because "object name" is more descriptive of the particular use of the
term.

> blob object::
>       Untyped object, i.e. the contents of a file.

This "i.e." should be "e.g.", since symlink targets are also stored as
blobs, as would any other bulk data stored by itself. (IIRC, Junio has
a tagged blob to hold his public key, for example.)

> tree object::
>       An object containing a list of blob and/or tree objects.
>       (A tree usually corresponds to a directory without
>       subdirectories).
>
> tree::
>       Either a working tree, or a tree object together with the
>       dependent blob and tree objects (i.e. a stored representation
>       of a working tree).
>
> cache::
>       A collection of files whose contents are stored as objects.
>       The cache is a stored version of your working tree. Well, can
>       also contain a second, and even a third version of a working
>       tree, which are used when merging.
>
> cache entry::
>       The information regarding a particular file, stored in the index.
>       A cache entry can be unmerged, if a merge was started, but not
>       yet finished (i.e. if the cache contains multiple versions of
>       that file).
>
> index::
>       Contains information about the cache contents, in particular
>       timestamps and mode flags ("stat information") for the files
>       stored in the cache. An unmerged index is an index which contains
>       unmerged cache entries.

I think we might want to entirely kill the "cache" term, and talk only
about the "index" and "index entries". Of course, a bunch of the code will
have to be renamed to make this completely successful, but we could change
the glossary and documentation, and mention "cache" and "cache entry" as
old names for "index" and "index entry" respectively.

> working tree::
>       The set of files and directories currently being worked on.
>       Think "ls -laR"

This is where the data is actually in the filesystem, and you can edit and
compile it (as opposed to a tree object or the index, which semantically
have the same contents, but aren't presented in the filesystem that way).

> directory::
>       The list you get with "ls" :-)
>
> checkout::
>       The action of updating the working tree to a revision which was
>       stored in the object database.

Move after "revision"?

> revision::
>       A particular state of files and directories which was stored in
>       the object database. It is referenced by a commit object.
>
> commit::
>       The action of storing the current state of the cache in the
>       object database. The result is a revision.
>
> commit object::
>       An object which contains the information about a particular
>       revision, such as parents, committer, author, date and the
>       tree object which corresponds to the top directory of the
>       stored revision.

Move "parent" around here.

> changeset::
>       BitKeeper/cvsps speak for "commit". Since git does not store
>       changes, but states, it really does not make sense to use
>       the term "changesets" with git.
>
> ent::
>       Favorite synonym to "tree-ish" by some total geeks.

Move after "tree-ish".

> head::
>       The top of a branch. It contains a ref to the corresponding
>       commit object.
>
> branch::
>       A non-cyclical graph of revisions, i.e. the complete history of
>       a particular revision, which does not (yet) have children, which
>       is called the branch head. The branch heads are stored in
>       $GIT_DIR/refs/heads/.

A branch head might have children, if they're in another branch. (E.g., I
pull mainline, make a new branch based on it, and commit a change; the
head of mainline is still a branch head, even though it's the parent of my
new commit, because my new commit isn't in mainline.)
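
A toy model of that situation, as a sketch only: the commit names A, B, C
and the dictionaries below are invented for illustration and say nothing
about how git stores history.

```python
# Hypothetical toy history: commit name -> list of parent names.
parents = {
    "A": [],
    "B": ["A"],   # last commit pulled from mainline
    "C": ["B"],   # my new commit, made on my own branch
}

# Branch heads, as they might be recorded under $GIT_DIR/refs/heads/.
heads = {"mainline": "B", "mybranch": "C"}

def history(commit):
    """Return the set of commits reachable from `commit`, itself included."""
    seen = set()
    todo = [commit]
    while todo:
        c = todo.pop()
        if c not in seen:
            seen.add(c)
            todo.extend(parents[c])
    return seen

def children(commit):
    """Return every commit that lists `commit` as a parent."""
    return sorted(c for c, ps in parents.items() if commit in ps)

mainline_history = history(heads["mainline"])
print(children(heads["mainline"]))  # the mainline head does have a child
print("C" in mainline_history)      # but that child is not in mainline
```

So "has no children" only holds if you restrict yourself to commits that are
in the branch's own history.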

> ref::
>       A 40-byte hex representation of a SHA1 pointing to a particular
>       object. These are stored in $GIT_DIR/refs/.
>
> head ref::
>       A ref pointing to a head. Often, this is abbreviated to "head".
>       Head refs are stored in $GIT_DIR/refs/heads/.
>
> tree-ish::
>       A ref pointing to either a commit object, a tree object, or a
>       tag object pointing to a commit or tree object.
>
> tag object::
>       An object containing a ref pointing to another object. It can
>       contain a (PGP) signature, in which case it is called "signed
>       tag object".
>
> tag::
>       A ref pointing to a tag or commit object. In contrast to a head,
>       a tag is not changed by a commit. Tags (not tag objects) are
>       stored in $GIT_DIR/refs/tags/. A git tag has nothing to do with
>       a Lisp tag (which is called object type in git's context).

As above, only the head for the branch being committed to is changed by a
commit. A tag, not being the head of a branch, is therefore never changed
by a commit.
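
The same point as a small sketch, with refs modelled as a plain dictionary.
The ref names, the commit names B and C, and the commit() helper are all
assumptions made for the example, not git's actual code.

```python
# Hypothetical refs, all pointing at commit "B" before the new commit.
refs = {
    "refs/heads/mainline": "B",
    "refs/heads/mybranch": "B",
    "refs/tags/v0.1": "B",
}

def commit(refs, branch, new_commit):
    """Return a copy of refs in which only the named branch head has moved."""
    updated = dict(refs)
    updated["refs/heads/" + branch] = new_commit
    return updated

after = commit(refs, "mybranch", "C")

# List every ref whose value changed as a result of the commit.
changed = sorted(name for name in refs if refs[name] != after[name])
print(changed)
```

Only the head of the branch being committed to appears in the list of
changed refs; the tag and the other head stay where they were.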

> merge::
>       To merge branches means to try to accumulate the changes since a
>       common ancestor and apply them to the first branch. An automatic
>       merge uses heuristics to accomplish that. Evidently, an automatic
>       merge can fail.
>
> resolve::
>       The action of fixing up manually what a failed automatic merge
>       left behind.

"Resolve" is also used for the automatic case (e.g., in
"git-resolve-script", which goes from having two commits and a message to
having a new commit). I'm not sure what the distinction is supposed to be.

        -Daniel
*This .sig left intentionally blank*
