On Fri, 7 Mar 2014 02:14:09 -0800 (PST)
tombert <tombert.gro...@live.at> wrote:

> I am doing a clone from https://github.com/tcltk/tcl and trying to
> create a changelog file from the git history.
> When doing:
> ># git log a45650c^..0fb4a39 --pretty="format:%h" --reverse > test.txt
> I get 10140 lines of commits.
> When doing
> ># git log 021bae7^..0fb4a39 --pretty="format:%h" --reverse >
> ># test2.txt
> I get 10123 lines of commits.
> Problems:
> 1) When examing the file test.txt the commit hash 021bbae7 is only
> one commit from a45650 away. So the commit line number should only
> differ by one, not 17 commits.
> 2) Both files should start with the <since> commit which they don't.

My take on this is that you're observing bogus history: the Tcl
project, at the time of these commits (2003), was maintained in CVS
and only later switched to Fossil [4].  Note this fact well: I followed
the conversion process, which took multiple attempts, and also the wars
about which DVCS to pick, which Git lost.  So... For a start, this repo
is not official; it seems to be maintained by the github user "das"
(who most probably is Dan Steffen--one of the Tcl/Tk gods), and I don't
know whether he used a CVS to Git conversion tool or a Fossil to Git
conversion tool, or how ongoing developments in the blessed Fossil repo
are synchronised with this one.

In this particular case, the history near the two commits in question,
a45650c and 021bae7, looks like *mostly* the same history, but with
totally different SHA-1 names (and sometimes slightly different commit
messages).  Hence, I'm 99% sure we're dealing here with artefacts of
the CVS -> Git conversion process.

So I'd drop to the Tcler's chat [5] and ask das directly.

Now let me write a lot more prose in an attempt to deal with a
slightly tangential problem which jumped out at me while I glanced over
your text.  You probably have an idea about Git's revision ranges which
differs from that of Git itself.  This appears to be a rather common
confusion so I'll try to explain it.  Many people try to think about a
Git repo's history in terms of more linear systems like Subversion, in
which the `svn log -rSINCE:UNTIL URL` command means exactly this: show
all commits which affected URL, starting at a revision not before SINCE
and ending at a revision not after UNTIL.

Contrary to this, in Git revision ranges make `git log` anchor its
operation at a certain commit or commits and then possibly prune certain
parts of the graph of commits reachable from these anchor commits.
The simple `git log FOO` makes `git log` start at commit FOO and walk
down the whole subgraph reachable from FOO.  Even when looking at this
simplest form you should understand that what it leaves out is not
"*later*" commits: Git does not have explicit commit date ordering (it
does not have any notion of a monotonically growing revision number like
Subversion does), so it only prunes away the parts of the history DAG
which are not reachable via the *parent* links of FOO.  For instance,
given the history (oldest commit on the left)

    BAR---BAZ---FOO---QUUX
            \
             BLAH---MUMBLE

`git log FOO` won't consider QUUX, BLAH and MUMBLE, and while it's kind
of obvious why QUUX is not considered, the latter two are excluded even
though they might have been recorded *earlier* than FOO.  The reason
for this is that they are not reachable from FOO: when `git log` hits
BAZ, it can only move to BAR via BAZ's parent link, not to BLAH.
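
If it helps, the walk can be modelled in a few lines of Python.  This is
only a toy sketch, not how Git is implemented: the `PARENTS` table and
the `reachable` helper are names I made up, and the layout of the toy
history (BLAH and MUMBLE branching off BAZ, QUUX sitting on top of FOO)
is an assumption inferred from the description above.

```python
# Toy history: each commit maps to the list of its parents.
# The layout is an assumption inferred from the text, not real Tcl history.
PARENTS = {
    "BAR": [],
    "BAZ": ["BAR"],
    "FOO": ["BAZ"],
    "QUUX": ["FOO"],
    "BLAH": ["BAZ"],
    "MUMBLE": ["BLAH"],
}


def reachable(start):
    """Return the set of commits reachable from `start` via parent links."""
    seen = set()
    stack = [start]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            # Only parent links are followed; children are never visited.
            stack.extend(PARENTS[commit])
    return seen


# Rough equivalent of `git log FOO` on the toy history.
print(sorted(reachable("FOO")))  # ['BAR', 'BAZ', 'FOO']
```

Note that no date appears anywhere in the model: QUUX, BLAH and MUMBLE
are left out purely because no chain of parent links leads from FOO to
them.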

A slightly more complicated case is `git log A B C D`, where `git log`
walks all the subgraphs reachable from the listed commits.

Now let's embark on the most complicated case:
the call `git log A ^B ^C ^D` will:
* Consider a subgraph of the whole history DAG starting at commit A.
* Exclude from that subgraph all the subgraphs reachable from commits
  B, C and D.  The prefix '^' is used to turn the commit anchor into an
  "excluding" one rather than an including one.

Please think about this case until it sinks in: arguments to `git log`
operate on sets of commits; they do not consider any ordering based on
commit date.  Yes, most useful (for humans) applications of `git log`
indeed involve "excluding" commits which are historically earlier than
the starting commit but this is not a requirement!  For instance, in
our toy example, calling `git log FOO ^MUMBLE` would output just FOO as
specifying MUMBLE would trim off the subgraph starting from BAZ.
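
The set arithmetic can be sketched in Python as follows.  Again this is
a toy model under my own assumptions, not Git's real algorithm: the
`log` function, the `reachable` helper and the `PARENTS` table are
hypothetical names, and the toy history is laid out as I understand it
from the description (BLAH and MUMBLE branch off BAZ).

```python
# Toy history: each commit maps to the list of its parents (assumed layout).
PARENTS = {
    "BAR": [],
    "BAZ": ["BAR"],
    "FOO": ["BAZ"],
    "QUUX": ["FOO"],
    "BLAH": ["BAZ"],
    "MUMBLE": ["BLAH"],
}


def reachable(start):
    """Return the set of commits reachable from `start` via parent links."""
    seen = set()
    stack = [start]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen


def log(include, exclude=()):
    """Model `git log A B ^C ^D`: union of the included subgraphs
    minus everything reachable from the excluded commits."""
    selected = set()
    for commit in include:
        selected |= reachable(commit)
    for commit in exclude:
        selected -= reachable(commit)
    return selected


# `git log FOO MUMBLE`: both subgraphs are walked.
print(sorted(log(["FOO", "MUMBLE"])))
# ['BAR', 'BAZ', 'BLAH', 'FOO', 'MUMBLE']

# `git log FOO ^MUMBLE`: BAZ and BAR are reachable from MUMBLE, so they go.
print(sorted(log(["FOO"], ["MUMBLE"])))
# ['FOO']
```

The second call excludes a commit that is not an ancestor of FOO at all,
and yet it still trims the output, because the only thing that matters
is which commits the two reachable sets have in common.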

Now consider that `git log A..B` is just a shorthand for `git log B ^A`,
that is, it walks the subgraph reachable from B but excludes the
subgraph reachable from A.  To add to the confusion, the `git log`
manual page oversimplifies things and writes this as <since>..<until>,
as if the commits were somehow ordered by date!  Yes, it immediately
advises consulting the gitrevisions manual page [3], but most folks
understandably assume that page contains only mundane details and
ignore it completely.  In fact, its "SPECIFYING RANGES" section is
invaluable to understanding how `git log` really works!
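
To make the shorthand concrete, here is one more toy sketch in Python.
It is not Git's parser: `expand_range`, `log_range`, `reachable` and
`PARENTS` are hypothetical names of my own, only the plain two-dot form
with both sides present is handled, and the toy history layout is the
same assumed one (BLAH and MUMBLE branch off BAZ).

```python
# Toy history: each commit maps to the list of its parents (assumed layout).
PARENTS = {
    "BAR": [],
    "BAZ": ["BAR"],
    "FOO": ["BAZ"],
    "QUUX": ["FOO"],
    "BLAH": ["BAZ"],
    "MUMBLE": ["BLAH"],
}


def reachable(start):
    """Return the set of commits reachable from `start` via parent links."""
    seen = set()
    stack = [start]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen


def expand_range(spec):
    """Rewrite "A..B" as (include, exclude), i.e. `B ^A`."""
    excluded, _, included = spec.partition("..")
    return included, excluded


def log_range(spec):
    """Model `git log A..B` as reachable(B) minus reachable(A)."""
    included, excluded = expand_range(spec)
    return reachable(included) - reachable(excluded)


# The left-hand commit itself is never part of the output.
print(sorted(log_range("BAZ..QUUX")))     # ['FOO', 'QUUX']

# The left-hand side does not even have to be an ancestor of the right.
print(sorted(log_range("MUMBLE..QUUX")))  # ['FOO', 'QUUX']
```

Two things follow from the model: the commit on the left of `..` is
itself excluded (which is why you wrote `a45650c^..` to get a45650c
included), and nothing requires it to lie "before" the right-hand
commit on the same line of history.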

See also [1] and [2].

1. https://groups.google.com/d/msg/git-users/fAOZgMTYNFA/JWpx-9C97VQJ
2. https://groups.google.com/d/msg/git-users/MPJCthuzuJs/7o4EFsaSbIgJ
3. https://www.kernel.org/pub/software/scm/git/docs/gitrevisions.html
4. http://core.tcl.tk
5. http://wiki.tcl.tk/1178
