Re: Apparent bug in git-gc

2014-09-16 Thread Dale R. Worley
> From: Jeff King 
> 
> > I have an 11 GB repository.  It passes git-fsck (though with a number
> > of dangling objects).  But when I run git-gc on it, the file
> > refs/heads/master disappears.
> 
> That's the expected behavior. Gc runs "git pack-refs", which puts an
> entry into packed-refs and prunes the loose ref.

Arrrgh, damn!  I knew that but I forgot it.  Sorry about the trouble.

(At least I don't have to worry about fixing it now.)

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Apparent bug in git-gc

2014-09-15 Thread Dale R. Worley
I have an unpleasant bug in git-gc:

Git version:  1.8.3.1
Running on:  Fedora 19 Gnu/Linux

I have an 11 GB repository.  It passes git-fsck (though with a number
of dangling objects).  But when I run git-gc on it, the file
refs/heads/master disappears.  Since HEAD points to refs/heads/master,
this makes the repository unusable.

What should I do next?

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What happens when the repository is bigger than gc.autopacklimit * pack.packSizeLimit?

2014-08-29 Thread Dale R. Worley
> From: Jeff King 

> That makes sense, though I question whether packs are really helping you
> in the first place. I wonder if you would be better off keep your
> non-delta binaries as loose objects (this would require a new option to
> pack-objects and teaching "gc --auto" to ignore these when counting
> loose objects, but would be fairly straightforward).

Having 40,000 lose objects might be troublesome in its own way.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What happens when the repository is bigger than gc.autopacklimit * pack.packSizeLimit?

2014-08-29 Thread Dale R. Worley
> From: Junio C Hamano 

> But if your definition of the boundary between "small" and "large"
> is unreasonably low (and/or your definition of "too many" is
> unreasonably small), you will always have the problem you found.

I would propose that a pack whose size is "close enough" to
packSizeLimit should be assumed to have already been built by
repacking, and shouldn't count against autopacklimit.

That's easy to implement, and causes the desirable result that "git gc
--auto" isn't triggerable immediate after repacking.

Of course, eventually there will be enough loose objects, and
everything will get repacked (even the "full" packs).  But that will
happen only occasionally.

That does leave open the question of what is "close enough".  Off the
top of my head, a pack which is larger than packSizeLimit minus (the
size limit for files we put in packs) can be considered "full" in this
test.

Then again, maybe the solution is to just set autopacklimit very high,
perhaps even by default -- in real use, eventually the gc.auto test
will be triggered.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What happens when the repository is bigger than gc.autopacklimit * pack.packSizeLimit?

2014-08-29 Thread Dale R. Worley
> From: Jeff King 

> why are you setting the packsize limit to 99m in the first place?

I want to copy the Git repository to box.com as a backup measure, and
my account on box.com limits files to 100 MB.

> There are more delta opportunities

In this repository, only the smallest files are text files; the bulk
of the files are executable binaries.  So I've set
core.bigFileThreshold to 10k to stop Git from attempting
delta-compression of the binaries.  That makes the repository slightly
larger, but it dramatically speeds the repacking process.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


What happens when the repository is bigger than gc.autopacklimit * pack.packSizeLimit?

2014-08-27 Thread Dale R. Worley
[Previously sent to the git-users mailing list, but it probably should
be addressed here.]

A number of commands invoke "git gc --auto" to clean up the repository
when there might be a lot of dangling objects and/or there might be
far too many unpacked files.  The manual pages say:

git gc:
   --auto
   With this option, git gc checks whether any housekeeping is
   required; if not, it exits without performing any work. Some git
   commands run git gc --auto after performing operations that could
   create many loose objects.

   Housekeeping is required if there are too many loose objects or too
   many packs in the repository. If the number of loose objects
   exceeds the value of the gc.auto configuration variable, then all
   loose objects are combined into a single pack using git repack -d
   -l. Setting the value of gc.auto to 0 disables automatic packing of
   loose objects.

git config:
   gc.autopacklimit
   When there are more than this many packs that are not marked with
   *.keep file in the repository, git gc --auto consolidates them into
   one larger pack. The default value is 50. Setting this to 0
   disables it.

What happens when the amount of data in the repository exceeds
gc.autopacklimit * pack.packSizeLimit?  According to the
documentation, "git gc --auto" will then *always* repack the
repository, whether it needs it or not, because the data will require
more than gc.autopacklimit pack files.

And it appears from an experiment that this is what happens.  I have a
repository with pack.packSizeLimit = 99m, and there are 104 pack
files, and even when "git gc" is done, if I do "git gc --auto", it
will do git-repack again.

Looking at the code, I see:

builtin/gc.c:
static int too_many_packs(void)
{
struct packed_git *p;
int cnt;

if (gc_auto_pack_limit <= 0)
return 0;

prepare_packed_git();
for (cnt = 0, p = packed_git; p; p = p->next) {
if (!p->pack_local)
continue;
if (p->pack_keep)
continue;
/*
 * Perhaps check the size of the pack and count only
 * very small ones here?
 */
cnt++;
}
return gc_auto_pack_limit <= cnt;
}

Yes, perhaps you *should* check the size of the pack!

What is a good strategy for making this function behave as we want it to?

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-29 Thread Dale R. Worley
> From: David Lang 

> well, as others noted, the problem is actually caused by doing the diffs, and 
> that is something that is a very common thing to do with source code.

To some degree, my attitude comes from When I Was A Boy, when you got
16k for both your bytecode and your data, so you never kept more than
one line of a file in memory unless you had to.

Regardless of that, the problem with "git fsck" is not due to doing
diffs, and "git commit" by default does diffs even if you don't ask
for them, so the observed problems cannot be subsumed under the label
"you asked for a diff of a file that can't be diffed".

> And I would assume that files of several MB would be able to fit in
> memory (again, this was assumed to be for development, and compilers
> take a lot of ram to run, so having enough ram to hold any
> individual source file while the compiler is _not_ using ram doesn't
> seem likely to be a problem)

At least the first versions of GCC only kept one function (and the
global symbol table) in memory at once, so you could compile a source
file that was larger than the available memory.

In any case, if Git is no longer limited to handling source files, it
needs to be updated so it can handle large files.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-28 Thread Dale R. Worley
> From: David Lang 

> Git was designed to track source code, there are warts that show up
> in the implementation when you use individual files >4GB

I'd expect that if you want to deal with files over 100k, you should
assume that it doesn't all fit in memory.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-28 Thread Dale R. Worley
> From: Junio C Hamano 

> You need to have enough memory (virtual is fine if you have enough
> time) to do fsck.  Some part of index-pack could be refactored into
> a common helper function that could be called from fsck, but I think
> it would be a lot of work.

How much memory is "enough"?  And how is it that fsck is coded so that
the available RAM is a limit on which repositories can be checked?
Did someone forget that RAM may be considerably smaller than the
amount of data a program may have to deal with?

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-28 Thread Dale R. Worley
> From: Duy Nguyen 

> I don't know how many commands are hit by this. If you have time and
> gdb, please put a break point in die_builtin() function and send
> backtraces for those that fail. You could speed up the process by
> creating a smaller file and set the environment variable
> GIT_ALLOC_LIMIT (in kilobytes) to a number lower than that size. If
> git attempts to allocate a block larger than that limit it'll die.

I don't use Git enough to exercise it well.  And there are dozens of
commands with hundreds of options.

As someone else has noted, if I run 'git commit -q --no-status', it
doesn't crash.

It seems that much of Git was coded under the assumption that any file
could always be held entirely in RAM.  Who made that mistake?  Are
people so out of touch with reality?

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Git chokes on large file

2014-05-27 Thread Dale R. Worley
I've discovered a problem using Git.  It's not clear to me what the
"correct" behavior should be, but it seems to me that Git is failing
in an undesirable way.

The problem arises when trying to handle a very large file.  For
example:

$ git --version
git version 1.8.3.1
$ mkdir $$
$ cd $$
$ git init
Initialized empty Git repository in 
/common/not-replicated/worley/temp/5627/.git/
$ truncate --size=20G big_file
$ ls -l
total 0
-rw-rw-r--. 1 worley worley 21474836480 May 27 11:59 big_file
$ time git add big_file

real4m48.752s
user4m31.295s
sys 0m16.747s
$

At this point, either 'git fsck' or 'git commit' fails:

$ git fsck --full --strict
notice: HEAD points to an unborn branch (master)
Checking object directories: 100% (256/256), done.
fatal: Out of memory, malloc failed (tried to allocate 21474836481 bytes)

$ git commit -m Test.
[master (root-commit) 3df3655] Test.
fatal: Out of memory, malloc failed (tried to allocate 21474836481 bytes)

The central problem is that one can accidentally add a file that
leaves the repository in a "broken" state, where various normal
commands simply don't work.  The most worrying aspect is that "git
fsck" fails -- of all the commands, the one that verifies the validity
of the repository (and diagnoses errors) should be the most robust!

Even doing a 'git reset' does not put the repository in a state where
'git fsck' will complete:

$ git reset
$ git fsck --full --strict
notice: HEAD points to an unborn branch (master)
Checking object directories: 100% (256/256), done.
fatal: Out of memory, malloc failed (tried to allocate 21474836481 bytes)

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: "git fsck" fails on malloc of 80 G

2013-12-18 Thread Dale R. Worley
> From: Jeff King 

> One of the problems I ran into recently is that
> corrupt data can cause it to make a large allocation

One thing I notice is that in unpack_compressed_entry() in
sha1_file.c, there is a mallocz of "size" bytes.  It appears that
"size" is the size of the object that is being unpacked.  If so, this
code cannot be correct, because it assumes that any file that is
stored in the repository can be put into a buffer allocated in RAM.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: "git fsck" fails on malloc of 80 G

2013-12-17 Thread Dale R. Worley
> From: Jeff King 
> 
> On Mon, Dec 16, 2013 at 11:05:32AM -0500, Dale R. Worley wrote:
> 
> > # git fsck
> > Checking object directories: 100% (256/256), done.
> > fatal: Out of memory, malloc failed (tried to allocate 80530636801 bytes)
> > #
> 
> Can you give you give us a backtrace from the die() call? It would help
> to know what it was trying to allocate 80G for.

Further information:

# git --version
git version 1.8.3.1
#

Here's the basic backtrace information, and the values of the "size"
variables, which seem to be the immediate culprits:

# gdb
GNU gdb (GDB) Fedora 7.6.1-46.fc19
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
(gdb) file /usr/bin/git
Reading symbols from /usr/bin/git...Reading symbols from 
/usr/lib/debug/usr/bin/git.debug...done.
done.
(gdb) break wrapper.c:59
Breakpoint 1 at 0x4f35ef: file wrapper.c, line 59.
(gdb) break die_child
Breakpoint 2 at 0x4d0ca0: file run-command.c, line 211.
(gdb) break die_async
Breakpoint 3 at 0x4d1020: file run-command.c, line 604.
(gdb) run fsck
Starting program: /usr/bin/git fsck
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Checking object directories: 100% (256/256), done.
Checking objects:   0% (0/526211)   
Breakpoint 1, xmalloc (size=size@entry=80530636801) at wrapper.c:59
59  die("Out of memory, malloc failed (tried to 
allocate %lu bytes)",
(gdb) bt
#0  xmalloc (size=size@entry=80530636801) at wrapper.c:59
#1  0x004f3633 in xmallocz (size=size@entry=80530636800)
at wrapper.c:73
#2  0x004d922f in unpack_compressed_entry (p=p@entry=0x7e4020, 
w_curs=w_curs@entry=0x7fffc9f0, curpos=654214694, size=80530636800)
at sha1_file.c:1797
#3  0x004db4cb in unpack_entry (p=p@entry=0x7e4020, 
obj_offset=654214688, final_type=final_type@entry=0x7fffd088, 
final_size=final_size@entry=0x7fffd098) at sha1_file.c:2072
#4  0x004b1e3f in verify_packfile (base_count=0, progress=0x9bdd80, 
fn=0x42fc00 , w_curs=0x7fffd090, p=0x7e4020)
at pack-check.c:119
#5  verify_pack (p=p@entry=0x7e4020, fn=fn@entry=0x42fc00 
, 
progress=0x9bdd80, base_count=base_count@entry=0) at pack-check.c:177
#6  0x00430724 in cmd_fsck (argc=0, argv=0x7fffe400, 
prefix=) at builtin/fsck.c:678
#7  0x00405cfd in run_builtin (argv=0x7fffe400, argc=1, 
p=0x75fa68 ) at git.c:284
#8  handle_internal_command (argc=1, argv=0x7fffe400) at git.c:446
#9  0x0040511f in run_argv (argv=0x7fffe2a0, 
argcp=0x7fffe2ac)
at git.c:492
#10 main (argc=1, argv=0x7fffe400) at git.c:567
(gdb) frame 2
#2  0x004d922f in unpack_compressed_entry (p=p@entry=0x7e4020, 
w_curs=w_curs@entry=0x7fffc9f0, curpos=654214694, size=80530636800)
at sha1_file.c:1797
1797buffer = xmallocz(size);
(gdb) p size
$29 = 80530636800
(gdb) p/x size
$30 = 0x12c000
(gdb) frame 3
#3  0x004db4cb in unpack_entry (p=p@entry=0x7e4020, 
obj_offset=654214688, final_type=final_type@entry=0x7fffd088, 
final_size=final_size@entry=0x7fffd098) at sha1_file.c:2072
2072data = unpack_compressed_entry(p, 
&w_curs, curpos, size);
(gdb) p size
$31 = 80530636800
(gdb) p/x size
$32 = 0x12c000
(gdb) 

I did a further test to see where the value of "size" came from:

(gdb) break sha1_file.c:2023
Breakpoint 4 at 0x4db073: file sha1_file.c, line 2023.
(gdb) cond 4 size == 0x12c000
(gdb) break sha1_file.c:2029
Breakpoint 5 at 0x4daee7: file sha1_file.c, line 2029.
(gdb) cond 5 size == 0x12c000
(gdb) break sha1_file.c:2072
Breakpoint 6 at 0x4db4b4: file sha1_file.c, line 2072.
(gdb) cond 6 size == 0x12c000
(gdb) break unpack_object_header_buffer
Breakpoint 7 at 0x4d9ea0: file sha1_file.c, line 1399.
(gdb) comm 7
Type commands for breakpoint(s) 7, one per line.
End with a line saying just "end".
>continue
>end
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program:

"git fsck" fails on malloc of 80 G

2013-12-16 Thread Dale R. Worley
I have a large repository (17 GiB of disk used), although no single
file in the repository is over 1 GiB.  (I have pack.packSizeLimit set
to "1g".)  I don't know how many files are in the repository, but it
shouldn't exceed several tens of commits each containing several tens
of thousands of files.

Due to Git crashing while performing an operation, I want to verify
that the repository is consistent.  However, when I run "git fsck" it
fails, apparently because it is trying to allocate 80 G of memory.  (I
can still do adds, commits, etc.)

# git fsck
Checking object directories: 100% (256/256), done.
fatal: Out of memory, malloc failed (tried to allocate 80530636801 bytes)
#

I don't know if this is due to an outright bug or not.  But it seems
to me that "git fsck" should not need to allocate any more memory than
the size (1 GiB) of a single pack file.  And given its purpose, "git
fsck" should be one of the *most* robust Git tools!

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Working patterns

2013-10-23 Thread Dale R. Worley
> The pattern I use is to have this:
> 
> /repository/.git
>   with core.worktree = /working
> /working/...
> 
> then
> 
> cd /repository
> git add /working/x/y
> git ...

The point I'm trying to make is that it appears that all of the Git
commands implemented as programs use the same code at the beginning to
determine the location of the Git repository and the root directory of
the work tree -- to determine GIT_DIR and GIT_WORK_TREE effectively.
And the code works with my work pattern, which seems intuitively
correct to me.

It seems to me that this code should implement the design, and the
design should match what the code does.

However, the parallel code in the Git commands that are scripts, which
seems to be in git-sh-setup, does not implement the same design as the
C code does.

It seems to me that the two sets of Git commands should be invokable
under the same circumstances, that there is a design specification as
to how Git can be invoked, and both implementations should match that.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git-users] Problem using detached worktrees with commands implemented in scripts

2013-10-21 Thread Dale R. Worley
> From: Junio C Hamano 

> > ... it's not clear why GIT_WORK_TREE exists, ...
> 
> The configuration item came _way_ later than the environment, and we
> need to keep users and scripts from old world working, that is why.

OK, that explains a great deal.  IIRC, I first became aware that
detached worktrees are possible through the documentation of
core.worktree.  As Git's architecture has a tight binding between the
repository and the worktree, it made a great deal of sense to me that
the repository points to the detached worktree.  And the absence of
core.worktree, a non-detached worktree, is essentially equivalent to
having core.worktree specify the directory containing the .git
directory.

So the obvious way (to me) to invoke Git with a detached worktree is
to set GIT_DIR to point to the repository, and the repository points
to the root of the worktree.  If the command operates on the worktree,
Git can compare the cwd with the worktree root to determine the
relative path of files.

(And you can see that in this situation, Git doesn't have to search
upward to try to determine where the worktree root is.)

What you're saying is that there's an older mode of operation where
the repository does not point to the worktree.  Instead, the caller
has to set GIT_DIR to locate the repository and GIT_WORK_TREE to
locate the worktree.  This would be prone to error, because the user
is responsible for always pairing the repository with the correct
worktree; it doesn't enforce the architectural assumption that the
repository is paired with a work tree.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git-users] Problem using detached worktrees with commands implemented in scripts

2013-10-18 Thread Dale R. Worley
> From: Junio C Hamano 

> It was unclear to me which part of our documentation needs updating
> and how, and that was (and still is) what I was primarily interested
> in finding out.

It seems to me that what is missing is a description of the
circumstances under which Git can be run.  With Subversion (the only
other source control system I know in detail), the working tree that
is operated on is at and below the cwd, and the working tree always
points to the repository.  (A subdirectory of a working tree is also a
valid working tree.)

With Git, it seems that the basic usage is that Git searches upward
from the cwd to find the top of the work tree, which is distinguished
by having a .git subdirectory.  The rules when the worktree is
detached are more complicated, and don't seem to be written in any
single place.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git-users] Problem using detached worktrees with commands implemented in scripts

2013-10-18 Thread Dale R. Worley
> From: Junio C Hamano 

> Now, when you say "the cwd contains the .git directory", do you mean
> 
>   cd /repositories
> git add ../working/trees/proj-wt1/file
> 
> updates "file" in the /repositories/proj.git/index?  Or do you mean
> this?

The pattern I use is to have this:

/repository/.git
/working/...

then

cd /repository
git add /working/x/y/z

works as you'd expect it to.  "git rm" seems to work correctly under
these circumstances as well.

I seem to recall that using relative  values doesn't work under
some conditions involving symbolic links, but I can't recall the
details right now.

> you talk about starting Git command _outside_ the working tree
> (whether the working tree has its repository embedded in it is not
> very relevant).

The above pattern is what I mean, where the cwd is not within the work
tree.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git-users] Problem using detached worktrees with commands implemented in scripts

2013-10-18 Thread Dale R. Worley
> From: Junio C Hamano 

>   Side note: without GIT_WORK_TREE environment (or
>   core.worktree), there is no way to tell where the top level
>   is, so you were limited to always be at the top level of
>   your working tree if you used GIT_DIR to refer to a
>   repository that is not embedded in your working tree.  There
>   were some changes in this area, but I do not recall the
>   details offhand.

That's not true.  The core.worktree config variable tells the top of
the worktree, so once you've located the repository, you know where
the worktree is.

Indeed, it's not clear why GIT_WORK_TREE exists, as that allows the
user to set GIT_WORK_TREE inconsistently with core.worktree.  (What
happens if you do that?)

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git-users] Problem using detached worktrees with commands implemented in scripts

2013-10-17 Thread Dale R. Worley
> From: Junio C Hamano 
> 
> wor...@alum.mit.edu (Dale R. Worley) writes:
> 
> > In general, Git commands on a repository with a detached worktree can
> > be executed by cd'ing into the directory containing the .git
> > directory, ...
> 
> Eh?  News to me; it might happened to have appeared to work by
> accident, but that is not by design.

I must admit I've never seen the design (and I personally doubt that
the design has ever been written down).  But at least the following
commands work correctly on a detached worktree if the current
directory contains the .git directory, because I am using them in a
production manner:

git add
git cat-file
git commit
git commit-tree
git config
git gc
git log
git ls-tree
git reset
git rev-list
git update-ref

In my situation, the worktree is not, in my mind, dependent on the
repository; the repository is intended to keep backups of the contents
of the directories that are worktree.  Indeed, one could establish
several detached repositories to back up different subsets of the same
worktree.  So it is conceptually natural to execute Git in the
repository directory.  And, after all, the current directory
identifies the repository and the repository contains a pointer to the
worktree.

> Not exporting GIT_DIR variable in sh-setup was done not by accident
> but as a very deliberate design choice, IIRC.

The intention of my change is that it appears that all of the failures
of my use pattern are when the command is implemented by a shell
script, and it appears that all shell scripts initially invoke
git-sh-setup.

The change specifically detects my use pattern and, for the remainder
of the script, changes the use pattern into a pattern closely related
to the one that Junio documents:

 - export GIT_DIR that points at the correct .git directory;

 - GIT_WORK_TREE is left unset

 - set the current directory to the top of the working tree

Perhaps the change should also set GIT_WORK_TREE.  I haven't noticed
setting GIT_WORK_TREE to be necessary for "git filter-branch", but
perhaps that is coincidental.

It seems to me that this change would uniformly support the use
pattern I use without affecting Git's behavior in any other case.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[git-users] Problem using detached worktrees with commands implemented in scripts

2013-10-16 Thread Dale R. Worley
In Git, one can set up a repository with a "detached worktree", where
the .git directory is not a subdirectory of the top directory of the
work tree.

In general, Git commands on a repository with a detached worktree can
be executed by cd'ing into the directory containing the .git
directory, and executing the Git command there.  E.g., "git add" and
"git commit" execute as one would expect.  (I think they can also be
executed by cd'ing to the worktree and setting GIT_DIR.)

However, this approach does not work with "git filter-branch", which
objects with "You need to run this command from the toplevel of the
working tree."

I suspect that it does not work with other Git commands that are
implemented with shell scripts.  The problem appears to be in the
git-sh-setup script, which is called by the Git shell scripts to set
up the environment and do preliminary tests.

It seems to me that this inconsistency between the script commands and
the binary commands can be fixed by updating git-sh-setup in this way:

--- git-sh-setup.Custom.orig2013-06-20 12:59:45.0 -0400
+++ git-sh-setup2013-10-07 22:34:06.719946134 -0400
@@ -297,14 +297,18 @@
 # if we require to be in a git repository.
 if test -z "$NONGIT_OK"
 then
-   GIT_DIR=$(git rev-parse --git-dir) || exit
+   export GIT_DIR=$(git rev-parse --git-dir) || exit
if [ -z "$SUBDIRECTORY_OK" ]
then
-   test -z "$(git rev-parse --show-cdup)" || {
-   exit=$?
-   echo >&2 "You need to run this command from the 
toplevel of the working tree."
-   exit $exit
-   }
+   cdup="$(git rev-parse --show-cdup)"
+   if [ -n "$cdup" ]
+   then
+   # Current directory is not the toplevel.
+   # Set GIT_DIR to the absolute path of the repository.
+   GIT_DIR=$(cd "$GIT_DIR" && pwd)
+   # cd to the toplevel.
+   cd $cdup
+   fi
fi
test -n "$GIT_DIR" && GIT_DIR=$(cd "$GIT_DIR" && pwd) || {
echo >&2 "Unable to determine absolute path of git directory"

What this change does is, when a command is invoked from a directory
containing a repository with a detached worktree, is to set GIT_DIR to
the directory of the repository, then cd to the top of the worktree.
After that, the script command should work as expected.

I am far from being an expert in Git internals, so I don't know
whether this is the correct approach to take to this problem or not.

Does anyone have any feedback on this?

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Using filter-branch without messing up the working copy

2013-10-03 Thread Dale R. Worley
I'm working on using "git filter-branch" to remove the history of a
large file from my repository so as to reduce the size of the
repository.  This pattern of use is effective for me:

1. $ git filter-branch --index-filter 'git rm --cached --ignore-unmatch 
core.4563' HEAD

2. edit .git/packed-refs to remove the line
   "c6589bef2c776277407686a3a81c5a76bfe41ba2 refs/original/refs/heads/hobgoblin"
(so that there is no reference to the old HEAD)

3. git gc --quiet --aggressive

The difficulty I am having is that I do not want to disturb the
working copy while doing this.  The working copy is my entire home
directory, because I am using the repository as a historical backup
system for my home directory.  Currently, Git will not execute
filter-branch if there are unstaged changes in the working copy,
despite the fact that "git filter-branch --index-filter" does not
touch the working copy itself.  (In this case, the file to be deleted,
"core.4563", is not now in the working copy.)

A further difficulty is that the repository is "remote", the .git
directory is not in my home directory, and core.worktree is
"/home/worley".

The best set of steps I have found that accomplishes my goals is to
(1) "disconnect" the repository from the working copy by removing the
core.worktree value, (2) use "git checkout -q -f master" to create in
the repository's location an entire copy of my home directory, (3)
perform the above filtering steps, (4) "reconnect" the repository by
adding the core.worktree value, and (5) deleting the temporary working
copy files.

Surely there is a better way than this.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: the pager

2013-09-02 Thread Dale R. Worley
> I've noticed that Git by default puts long output through "less" as a
> pager.  I don't like that, but this is not the time to change
> established behavior.  But while tracking that down, I noticed that
> the paging behavior is controlled by at least 5 things:
> 
> the -p/--paginate/--no-pager options
> the GIT_PAGER environment variable
> the PAGER environment variable
> the core.pager Git configuration variable
> the build-in default (which seems to usually be "less")

One complication is the meaning of -p/--no-pager:

With the remaining sources, we assume that there is a priority
sequence, and that is used to determine what the pager is.

There is a somewhat independent question of when the pager is
activated.  What I know so far is that some commands use the pager by
default and some by default do not.  My expectation is that
--no-pager can be used to suppress the pager for *any* command.  Is it
also true that -p can force the pager for *any* command, or are there
commands which will not page even with -p?

I assume that if -p is specified but the "which pager" selection is
"cat" (or some other specification of no pager), then there is no
paging operation.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: the pager

2013-09-02 Thread Dale R. Worley
> From: Matthieu Moy 

> > const char *git_pager(int stdout_is_tty)
> > {
> > const char *pager;
> >
> > if (!stdout_is_tty)
> > return NULL;
> >
> > pager = getenv("GIT_PAGER");
> > if (!pager) {
> > if (!pager_program)
> > git_config(git_default_config, NULL);
> > pager = pager_program;
> > }
> > if (!pager)
> > pager = getenv("PAGER");
> > if (!pager)
> > pager = DEFAULT_PAGER;
> > else if (!*pager || !strcmp(pager, "cat"))
> > pager = NULL;
> 
> I guess the "else" could and should be dropped. If you do so (and
> possibly insert a blank line between the DEFAULT_PAGER case and the
> "pager = NULL" case), you get a nice pattern
> 
> if (!pager)
>   try_something();
> if (!pager)
>   try_next_option();

That's true, but it would change the effect of using "cat" as a value:
"cat" as a value of DEFAULT_PAGER would cause git_pager() to return
NULL, whereas now it causes git_pager() to return "cat".  (All other
places where "cat" can be a value are translated to NULL already.)

This is why I want to know what the *intended* behavior is, because we
might be changing Git's behavior, and I want to know that if we do
that, we're changing it to what it should be.  And I haven't seen
anyone venture an opinion on what the intended behavior is.

> I agree that a comment like this would help, though:
> 
> --- a/cache.h
> +++ b/cache.h
> @@ -1266,7 +1266,7 @@ static inline ssize_t write_str_in_full(int fd, const 
> char *str)
>  
>  /* pager.c */
>  extern void setup_pager(void);
> -extern const char *pager_program;
> +extern const char *pager_program; /* value read from git_config() */
>  extern int pager_in_use(void);
>  extern int pager_use_color;
>  extern int term_columns(void);

First off, the wording is wrong, it should be "value set by
git_config()".

But that doesn't tell the reader what the significance of the value
is.  I suspect that a number of global variables need to be marked:

> /* The pager program name, or "cat" if there is no pager.
>  * Can be overridden by the pager. configuration value for a
>  * single command, or suppressed by the --no-pager option.
>  * Set by calling git_config().
>  * NULL if hasn't been set yet by calling git_config(). */
> extern const char *pager_program;

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2] git-diff: Clarify operation when not inside a repository.

2013-08-29 Thread Dale R. Worley
> From: Junio C Hamano 

> diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
> index b1630ba..33fbd8c 100644
> --- a/Documentation/git-diff.txt
> +++ b/Documentation/git-diff.txt
> @@ -28,11 +28,15 @@ two blob objects, or changes between two files on disk.
>   words, the differences are what you _could_ tell Git to
>   further add to the index but you still haven't.  You can
>   stage these changes by using linkgit:git-add[1].
> -+
> -If exactly two paths are given and at least one points outside
> -the current repository, 'git diff' will compare the two files /
> -directories. This behavior can be forced by --no-index or by
> -executing 'git diff' outside of a working tree.
> +
> +'git diff' --no-index [--options] [--] [...]::
> +
> + This form is to compare the given two paths on the
> + filesystem.  You can omit the `--no-index` option when
> + running the command in a working tree controlled by Git and
> + at least one of the paths points outside the working tree,
> + or when running the command outside a working tree
> + controlled by Git.

That does break out the --no-index case in a clearer way.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: the pager

2013-08-29 Thread Dale R. Worley
So I set out to verify in the code that the order of priority of pager
specification is

GIT_PAGER > core.pager > PAGER > default

I discovered that there is also a pager. configuration
variable.

I was expecting the code to be simple, uniform (with regard to the 5
sources), and reasonably well documented.  The relevant parts of the
code that I have located so far include:

in environment.c:

const char *pager_program;

in config.c:

int git_config_with_options(config_fn_t fn, void *data,
const char *filename,
const char *blob_ref,
int respect_includes)
{
char *repo_config = NULL;
int ret;
struct config_include_data inc = CONFIG_INCLUDE_INIT;

if (respect_includes) {
inc.fn = fn;
inc.data = data;
fn = git_config_include;
data = &inc;
}

/*
 * If we have a specific filename, use it. Otherwise, follow the
 * regular lookup sequence.
 */
if (filename)
return git_config_from_file(fn, filename, data);
else if (blob_ref)
return git_config_from_blob_ref(fn, blob_ref, data);

repo_config = git_pathdup("config");
ret = git_config_early(fn, data, repo_config);
if (repo_config)
free(repo_config);
return ret;
}

in pager.c:

/* returns 0 for "no pager", 1 for "use pager", and -1 for "not specified" 
*/
int check_pager_config(const char *cmd)
{
struct pager_config c;
c.cmd = cmd;
c.want = -1;
c.value = NULL;
git_config(pager_command_config, &c);
if (c.value)
pager_program = c.value;
return c.want;
}

const char *git_pager(int stdout_is_tty)
{
const char *pager;

if (!stdout_is_tty)
return NULL;

pager = getenv("GIT_PAGER");
if (!pager) {
if (!pager_program)
git_config(git_default_config, NULL);
pager = pager_program;
}
if (!pager)
pager = getenv("PAGER");
if (!pager)
pager = DEFAULT_PAGER;
else if (!*pager || !strcmp(pager, "cat"))
pager = NULL;

return pager;
}

What's with the code?  It's not simple, it's not uniform (e.g.,
setting env. var. PAGER to "cat" will cause git_pager() to return
NULL, but setting preprocessor var. DEFAULT_PAGER to "cat" will cause
it to return "cat"), and it's barely got any comments at all (a global
variable has *no description whatsoever*).

I'd like to clean up the manual pages at least, but it would take me
hours to figure out what the code *does*.

I know I'm griping here, but I thought that part of the reward for
contributing to an open-source project was as a showcase of one's
work.  Commenting your code is what you learn first in programming.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: the pager

2013-08-28 Thread Dale R. Worley
> From: Junio C Hamano 
> 
> > I've noticed that Git by default puts long output through "less" as a
> > pager.  I don't like that, but this is not the time to change
> > established behavior.  But while tracking that down, I noticed that
> > the paging behavior is controlled by at least 5 things:
> >
> > the -p/--paginate/--no-pager options
> > the GIT_PAGER environment variable
> > the PAGER environment variable
> > the core.pager Git configuration variable
> > the build-in default (which seems to usually be "less")
> > ...
> > What is the (intended) order of precedence of specifiers of paging
> > behavior?  My guess is that it should be the order I've given above.
> 
> I think that sounds about right (I didn't check the code, though).
> The most specific to the command line invocation (i.e. option)
> trumps the environment, which trumps the configured default, and the
> hard wired stuff is used as the fallback default.
> 
> I am not sure about PAGER environment and core.pager, though.
> People want Git specific pager that applies only to Git process
> specified to core.pager, and still want to use their own generic
> PAGER to other programs, so my gut feeling is that it would make
> sense to consider core.pager a way to specify GIT_PAGER without
> contaminating the environment, and use both to override the generic
> PAGER (in other words, core.pager should take precedence over PAGER
> as far as Git is concerned).

I've just discovered this bit of documentation.  Within the git-var
manual page is this entry:

   GIT_PAGER
   Text viewer for use by git commands (e.g., less). The value is
   meant to be interpreted by the shell. The order of preference is
   the $GIT_PAGER environment variable, then core.pager configuration,
   then $PAGER, and then finally less.

This suggests that the ordering is GIT_PAGER > core.pager > PAGER >
default.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


the pager

2013-08-26 Thread Dale R. Worley
I've noticed that Git by default puts long output through "less" as a
pager.  I don't like that, but this is not the time to change
established behavior.  But while tracking that down, I noticed that
the paging behavior is controlled by at least 5 things:

the -p/--paginate/--no-pager options
the GIT_PAGER environment variable
the PAGER environment variable
the core.pager Git configuration variable
the build-in default (which seems to usually be "less")

There is documentation in git.1 and git-config.1, and the two are not
coordinated to make it clear what happens in all cases.  And the
built-in default is not mentioned at all.

What is the (intended) order of precedence of specifiers of paging
behavior?  My guess is that it should be the order I've given above.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2] git-diff: Clarify operation when not inside a repository.

2013-08-23 Thread Dale R. Worley
> From: Junio C Hamano 

> I suspect that it may be a good idea to split the section altogether
> to reduce confusion like what triggered this thread, e.g.
> 
> 'git diff' [--options] [--] [...]::
> 
> This form is to view the changes you made relative to
> the index (staging area for the next commit).  In other
> words, the differences are what you _could_ tell Git to
> further add to the index but you still haven't.  You can
> stage these changes by using linkgit:git-add[1].
> 
> 'git diff' --no-index [--options] [--]  ::
> 
>   This form is to compare the given two paths on the
>   filesystem.  When run in a working tree controlled by
>   Git, if at least one of the paths points outside the
>   working tree, or when run outside a working tree
>   controlled by Git, you can omit the `--no-index` option.
> 
> For now, I'll queue your version as-is modulo style fixes, while
> waiting for others to help polishing the documentation better.

It'd difficult to figure out how to describe it well.  In my opinion,
the problem here is the DWIM nature of the command, which means that
there is a lot of interaction between the options that are specified,
the number of path arguments, and the circumstances.  My preference is
for "do what I say", that the options restrict the command to operate
in exactly one way, which determines the way the paths are used (and
thus their number) and the context in which it can be used.  But
that's not how git-diff works.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2] git-diff: Clarify operation when not inside a repository.

2013-08-22 Thread Dale R. Worley
Clarify documentation for git-diff:  State that when not inside a
repository, --no-index is implied (and thus two arguments are
mandatory).

Clarify error message from diff-no-index to inform user that CWD is
not inside a repository and thus two arguments are mandatory.

Signed-off-by: Dale Worley 
---


The error message has been updated from [PATCH].  "git diff" outside a
repository now produces:

Not a git repository
To compare two paths outside a working tree:
usage: git diff [--no-index]  

This should inform the user of his error regardless of whether he
intended to perform a within-repository "git diff" or an
out-of-repository "git diff".

This message is closer to the message that other Git commands produce:

fatal: Not a git repository (or any parent up to mount parent )
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

"git diff --no-index" produces the same message as before (since the
user is clearly invoking the non-repository behavior):

usage: git diff --no-index  

Regarding the change to git-diff.txt, perhaps "forced ... by executing
'git diff' outside of a working tree" is not the best wording, but it
should be clear to the reader that (1) it is possible to execute 'git
diff' outside of a working tree, and (2) when doing so, the behavior
will be as if '--no-index' was specified.

I've also added some comments for the new code.


 Documentation/git-diff.txt |3 ++-
 diff-no-index.c|   12 +++-
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
index 78d6d50..9f74989 100644
--- a/Documentation/git-diff.txt
+++ b/Documentation/git-diff.txt
@@ -31,7 +31,8 @@ two blob objects, or changes between two files on disk.
 +
 If exactly two paths are given and at least one points outside
 the current repository, 'git diff' will compare the two files /
-directories. This behavior can be forced by --no-index.
+directories. This behavior can be forced by --no-index or by 
+executing 'git diff' outside of a working tree.
 
 'git diff' [--options] --cached [] [--] [...]::
 
diff --git a/diff-no-index.c b/diff-no-index.c
index e66fdf3..9734ec3 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -215,9 +215,19 @@ void diff_no_index(struct rev_info *revs,
 path_inside_repo(prefix, argv[i+1])))
return;
}
-   if (argc != i + 2)
+   if (argc != i + 2) {
+   if (!no_index) {
+   /* There was no --no-index and there were not two
+* paths.  It is possible that the user intended
+* to do an inside-repository operation. */
+   fprintf(stderr, "Not a git repository\n");
+   fprintf(stderr,
+   "To compare two paths outside a working 
tree:\n");
+   }
+   /* Give the usage message for non-repository usage and exit. */
usagef("git diff %s  ",
   no_index ? "--no-index" : "[--no-index]");
+   }
 
diff_setup(&revs->diffopt);
for (i = 1; i < argc - 2; ) {
-- 
1.7.7.6

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] git-diff: Clarify operation when not inside a repository.

2013-08-21 Thread Dale R. Worley
Clarify documentation for git-diff:  State that when not inside a
repository, --no-index is implied (and thus two arguments are
mandatory).

Clarify error message from diff-no-index to inform user that CWD is
not inside a repository and thus two arguments are mandatory.

Signed-off-by: Dale Worley 
---


This clarification is to avoid a problem I ran into.  I executed 'git
diff' in the remote working tree of a repository, and not in the
repository directory itself.  Because of that, git-diff assumed
git-diff --no-index, and executed diff-no-index.  Since I hadn't
provided paths, diff-no-index produced an error message.
Unfortunately, the error message presupposes that the decision to
execute diff-no-index reflects the user's intention, thus leaving me
confused, as the error message is only:
usage: git diff [--no-index]  
and does not cover the case I intended.  This patch changes the
message to notify the user that he is getting --no-index semantics
because he is outside of a repository:
Not within a git repository:
usage: git diff [--no-index]  
The additional line is suppressed if the user specified --no-index.

The documentation is expanded to state that execution outside of a
repository forces --no-index behavior.  Previously, the manual implied
this but did not state it, making it easy for the user to overlook
that it's possible to run git-diff outside of a repository.

Dale


 Documentation/git-diff.txt |3 ++-
 diff-no-index.c|6 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
index 78d6d50..9f74989 100644
--- a/Documentation/git-diff.txt
+++ b/Documentation/git-diff.txt
@@ -31,7 +31,8 @@ two blob objects, or changes between two files on disk.
 +
 If exactly two paths are given and at least one points outside
 the current repository, 'git diff' will compare the two files /
-directories. This behavior can be forced by --no-index.
+directories. This behavior can be forced by --no-index or by 
+executing 'git diff' outside of a working tree.
 
 'git diff' [--options] --cached [] [--] [...]::
 
diff --git a/diff-no-index.c b/diff-no-index.c
index e66fdf3..98c5f76 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -215,9 +215,13 @@ void diff_no_index(struct rev_info *revs,
 path_inside_repo(prefix, argv[i+1])))
return;
}
-   if (argc != i + 2)
+   if (argc != i + 2) {
+   if (!no_index) {
+   fprintf(stderr, "Not within a git repository:\n");
+   }
usagef("git diff %s  ",
   no_index ? "--no-index" : "[--no-index]");
+   }
 
diff_setup(&revs->diffopt);
for (i = 1; i < argc - 2; ) {
-- 
1.7.7.6

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH revised] git_mkstemps: add test suite test

2013-08-06 Thread Dale R. Worley
> git commit --allow-empty -m message <&-

Though as of [fb56570] "Sync with maint to grab trivial doc fixes",
that test doesn't fail for me if I revert to

fd = open(pattern, O_CREAT | O_EXCL | O_RDWR, mode);
if (fd > 0)
return fd;

I haven't been watching the code changes carefully; has there been a
fix that is expected to cause that?

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH revised] git_mkstemps: add test suite test

2013-08-06 Thread Dale R. Worley
> From: Junio C Hamano 
> 
> Thanks. I thought I've already queued 
> 
> Message-ID: <7vfvuokpr0@alter.siamese.dyndns.org>
> aka 
> http://article.gmane.org/gmane.comp.version-control.git/231680
> 
> which tests
> 
> git commit --allow-empty -m message <&-

My mistake...  I've been so intent on revising my repository and
rewriting the patch that I overlooked that you'd done the revision
already.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] SubmittingPatches: clarify some details of the patch format

2013-08-06 Thread Dale R. Worley
---
This is a first draft of a patch that clarifies a number of points
about how patches should be formatted that have tripped me up.  I have
re-filled a few of the paragraphs, which makes it hard to see from the
diff what I've changed.  This listing shows the changed words between
{ ... }:

 { This first line should be a separate paragraph, that is, it should be
followed by an empty line, which is then followed by the body of the
commit message } .

 { At the end of the commit message should be a Signed-off-by line giving
your name.  This can be added to the commit message automatically by
giving "git commit" the "--signoff" option.  This line has legal
implications, see "Sign your work" below } .

People on the Git mailing list need to be able to read and comment on
the changes you are submitting.  It is important for a developer to be
able to "quote" your changes, using standard e-mail tools, so that
they may comment on specific portions of your code.  For this reason,
all patches should be submitted "inline" { rather than as message
attachments } .  If your log message (including your name on the
Signed-off-by line) is not writable in ASCII, make sure that you send
off { the } message in the correct encoding.

"git format-patch" command follows the best current practice to
format the { patch for transmission as an e-mail message.  The files
that it outputs are sample e-mail messages in "mh" format.  The
initial lines are sample From, To, Date, and Subject headers, which
you will likely have to remove or revise, depending on how your MUA
operates } .

At the beginning of the patch should come your commit message ( { the
first line in the Subject header, the remainder in the body of the
message } ), ending with the Signed-off-by: lines, and a line that
consists of three dashes, followed by the diffstat information and the
patch itself.  If you are forwarding a patch from somebody else,
optionally, at the beginning of the e-mail message just before the
commit message starts, you can put a "From: " line to name that
person.

You often want to add additional explanation about the patch,
other than the commit message itself.  Place such "cover letter"
material between the three-dash { line } and the diffstat. Git-notes
can also be inserted using the `--notes` option.

> From: Junio C Hamano 
> 
> I am not sure if SubmittingPatches is a good place, though.
> The document is a guidance for people who contribute to _this_
> project.
> 
> But the specialness of the first paragraph applies to any project
> that uses Git, so people other than those who contribute to this
> project should be aware of it.

All of that is true.  But what can we do to ensure that someone who
submits a patch does so with the right format?  The special treatment
of the first line is not a requirement.  See, e.g., the git-commit
manual page:

   Though not required, it’s a good idea to begin the commit message with
   a single short (less than 50 character) line summarizing the change,
   followed by a blank line and then a more thorough description. Tools
   that turn commits into email, for example, use the first line on the
   Subject: line and the rest of the commit in the body.

> Originally we literally used "first line", but that made many things
> like shortlog output and patch Subject: useless when people write a
> block of text starting from the first line without a title.  Also
> after resurrecting such a text from e-mail, "am" couldn't tell if
> the "first line" on the "Subject:" is meant to be the first line of
> the same first paragraph (which is not what we encourage), or it is
> properly a single line title, and need a blank line before the first
> line of the body.  So quite a while ago, we changed the rule to take
> "the first paragraph" and use that in these places where we want to
> give a title of a patch.

And as you note, if organizing the first line this way was a
requirement, git-am would be able to unambiguously reconstruct the
commit message from an e-mail.  The only way to minimize errors in
this probject is to clearly specify what is required within this
project.

Dale


 Documentation/SubmittingPatches |   47 +-
 1 files changed, 31 insertions(+), 16 deletions(-)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index d0a4733..e974dd7 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -85,6 +85,10 @@ identifier for the general area of the code being modified, 
e.g.
 If in doubt which identifier to use, run "git log --no-merges" on the
 files you are modifying to see the current conventions.
 
+This first line should be a separate paragraph, that is, it should be
+followed by an empty line, which is then followed by the body of the
+commit message.
+
 The body 

[PATCH revised] git_mkstemps: add test suite test

2013-08-06 Thread Dale R. Worley
Commit a2cb86 ("git_mkstemps: correctly test return value of open()",
12 Jul 2013) fixes a bug regarding testing the return of an open()
call for success/failure.  Add a testsuite test for that fix.  The
test exercises a situation where that open() is known to return 0.

Signed-off-by: Dale Worley 
---
This version of the patch cleans up a number of errors in my previous
version (which were ultimately due to my faulty updating of my master
branch).  The commit that added the open() test is now correctly
described.  Since the test was not present in the test suite at all,
the patch is described as adding the test rather than improving it.

a2cb86 is on branch tr/fd-gotcha-fixes, but that has been merged into
master now.

(Thanks for your patience with this.)

Dale

 t/t0070-fundamental.sh | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/t/t0070-fundamental.sh b/t/t0070-fundamental.sh
index 986b2a8..d427f3a 100755
--- a/t/t0070-fundamental.sh
+++ b/t/t0070-fundamental.sh
@@ -25,6 +25,13 @@ test_expect_success POSIXPERM,SANITY 'mktemp to unwritable 
directory prints file
grep "cannotwrite/test" err
 '
 
+test_expect_success 'git_mkstemps_mode does not fail if fd 0 is not open' '
+   git init &&
+   echo Test. >test-file &&
+   git add test-file &&
+   git commit -m Message. <&-
+'
+
 test_expect_success 'check for a bug in the regex routines' '
# if this test fails, re-build git with NO_REGEX=1
test-regex
-- 
1.8.4.rc1.24.gd407a5c

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] git_mkstemps: improve test suite test

2013-08-02 Thread Dale R. Worley
Commit 52749 fixes a bug regarding testing the return of an open()
call for success/failure.  Improve the testsuite test for that fix by
removing the helper program 'test-close-fd-0' and replacing it with
the shell redirection '<&-'.  (The redirection is Posix, so it should
be portable.)

Signed-off-by: Dale Worley 
---

> From: Junio C Hamano 
> Date: Fri, 19 Jul 2013 07:29:47 -0700
> 
> The change itself looks good; care to write it up as a proper patch
> with a proposed log message?

My apologies for the delay; I've had to do some yak-shaving to learn
how to construct patches properly.  (I've written some clarifications
for Document/SubmittingPatches, which I will submit separately.)

Someone has gone ahead and made the code change, so all that remains
is to update the testsuite test by replacing the helper program
'test-close-fd-0' with the Posix shell redirection '<&-'.

Dale


 Makefile  |1 -
 test-close-fd-0.c |   14 --
 2 files changed, 0 insertions(+), 15 deletions(-)
 delete mode 100644 test-close-fd-0.c

diff --git a/Makefile b/Makefile
index 8ad40d4..3588ca1 100644
--- a/Makefile
+++ b/Makefile
@@ -557,7 +557,6 @@ X =
 PROGRAMS += $(patsubst %.o,git-%$X,$(PROGRAM_OBJS))
 
 TEST_PROGRAMS_NEED_X += test-chmtime
-TEST_PROGRAMS_NEED_X += test-close-fd-0
 TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
diff --git a/test-close-fd-0.c b/test-close-fd-0.c
deleted file mode 100644
index 3745c34..000
--- a/test-close-fd-0.c
+++ /dev/null
@@ -1,14 +0,0 @@
-#include 
-
-/* Close file descriptor 0 (which is standard-input), then execute the
- * remainder of the command line as a command. */
-
-int main(int argc, char **argv)
-{
-   /* Close fd 0. */
-   close(0);
-   /* Execute the requested command. */
-   execvp(argv[1], &argv[1]);
-   /* If execve() failed, return an error. */
-   return 1;
-}
-- 
1.7.7.6

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Making a patch: "git format-patch" does not produce the documented format

2013-08-02 Thread Dale R. Worley
> From: John Keeping 
> 
> git-format-patch(1) says:
> 
> By default, the subject of a single patch is "[PATCH] " followed
> by the concatenation of lines from the commit message up to the
> first blank line (see the DISCUSSION section of git-commit(1)).
> 
> I think that accurately describes what you're seeing.  The referenced
> DISCUSSION section describes how to write a commit message that is
> formatted in a suitable way, with a short first subject line and then a
> blank line before the body of the message.

Thanks for the confirmation.  I've figured out what is going wrong:
Documentation/SubmittingPatches says:

The first line of the commit message should be a short description (50
characters is the soft limit, see DISCUSSION in git-commit(1)), and
should skip the full stop.

What it *doesn't* say is that the second line of the commit message
should be empty -- precisely so that git format-patch turns the first
line into the Subject: but does not merge the remainder of the commit
message (the "body") into the Subject: line.  Now that I know to look
for this, I can see that the commit messages in the Git repository
show this pattern.

I'm preparing some clarifications of SubmittingPatches to explain
things that a new person (e.g., me) would not know.  To fix this
issue, I am inserting:

This first line should be a separate paragraph, that is, it should be
followed by an empty line, which is then followed by the body of the
commit message.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Difficulty adding a symbolic link, part 3

2013-08-01 Thread Dale R. Worley
> From: Duy Nguyen 

> > Can someone give me advice on what this code *should* do?
> 
> It does as the function name says: given cwd, a prefix (i.e. a
> relative path with no ".." components) and a path relative to
> cwd+prefix, convert 'path' to something relative to cwd. In the
> simplest case, prepending the prefix to 'path' is enough. cwd is also
> get_git_work_tree().
> 
> I agree with you that this code should not resolve anything in the
> components after 'cwd', after rebasing the path to 'cwd' (not just the
> final component). Not sure how to do it correctly though because we do
> need to resolve symlinks before cwd. Maybe a new variant of real_path
> that stops at 'cwd'?
> 
> We may also have problems with resolve symlinks before cwd when 'path'
> is relative, as normalize_path_copy() does not resolve symlinks. We
> just found out emacs has this bug [1] but did not realize we also have
> one :-P.

Thanks for the detailed information.  It seems to me that the minimum
needed change is that the handling of relative and absolute paths
should be made consistent.

> [1] http://thread.gmane.org/gmane.comp.version-control.git/231268

That problem isn't so much a matter of not resolving symlinks but
rather the question of what ".." means.  In the case shown in that
message, the current directory *is* {topdir}/z/b, but it was entered
with "cd {topdir}/b" -- and the shell records the latter as the value
of $PWD.  (Actually, the bash shell can handle symbolic-linked
directories *either* way, depending on whether it is given the "-P"
option.)

When Emacs is given the file name "../a/file", does the ".." mean to
go up one directory *textually* in the pathspec based on $PWD

"{topdir}/b/../a/file" -> "{topdir}/a/file" (which does not exist)

or according to the directory linkage on the disk (where the ".."
entry in the current directory goes to the directory named {topdir}/z,
which does contain a file a/file.  It looks like Emacs uses the first
method.

The decision of which implementation to use depends on the semantics
you *want*.  As I noted, bash allows the user to choose *either*
interpretation, which shows that both semantics are considered valid
(by some people).

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Difficulty adding a symbolic link, part 3

2013-08-01 Thread Dale R. Worley
> From: Duy Nguyen 

> > With the above change, the test suite runs with zero failures, so it
> > doesn't affect any common Git usage.
> 
> It means the test suite is incomplete. As you can see, the commit
> introducing this change does not come with a test case to catch people
> changing this.

Who should be blamed for omitting the test?

> > Can someone give me advice on what this code *should* do?
> 
> It does as the function name says: given cwd, a prefix (i.e. a
> relative path with no ".." components) and a path relative to
> cwd+prefix, convert 'path' to something relative to cwd. In the
> simplest case, prepending the prefix to 'path' is enough. cwd is also
> get_git_work_tree().

But as you can see, the exact behavior that the function is intended
to exhibit regarding symlinks is not clear from the function name;
there should have been a real explanation in the comment above the
function.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Making a patch: "git format-patch" does not produce the documented format

2013-07-31 Thread Dale R. Worley
I'm working on writing a patch, but I'm running into a problem.  The
patch itself is from this commit:

$ git log -1
commit 07a25537909dd277426818a39d9bc4235e755383
Author: Dale Worley 
Date:   Thu Jul 18 18:43:12 2013 -0400

open() returns -1 on failure, and indeed 0 is a possible success value
if the user closed stdin in our process.  Fix the test.
$ 

But the output of "git format-patch" is:

From 07a25537909dd277426818a39d9bc4235e755383 Mon Sep 17 00:00:00 2001
From: Dale Worley 
Date: Thu, 18 Jul 2013 18:43:12 -0400
Subject: [PATCH] open() returns -1 on failure, and indeed 0 is a possible
 success value if the user closed stdin in our process.  Fix
 the test.

---
 t/t0070-fundamental.sh |7 +++
 wrapper.c  |2 +-
 2 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/t/t0070-fundamental.sh b/t/t0070-fundamental.sh
index 986b2a8..d427f3a 100755
--- a/t/t0070-fundamental.sh
+++ b/t/t0070-fundamental.sh
@@ -25,6 +25,13 @@ test_expect_success POSIXPERM,SANITY 'mktemp to 
unwritable directory prints file
grep "cannotwrite/test" err
[...]

Notice that the whole commit message has been formatted as if it is
part of the Subject line, and the line breaks in the commit message
have been refilled.

The file Documentation/SubmittingPatches says that "git format-patch"
produces patches in the best format, but the manual page shows an
example more like this:

From 8f72bad1baf19a53459661343e21d6491c3908d3 Mon Sep 17 00:00:00 2001
From: Tony Luck 
Date: Tue, 13 Jul 2010 11:42:54 -0700
Subject: [PATCH] Put ia64 config files on the
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

arch/arm config files were slimmed down using a python script
(See commit c2330e286f68f1c408b4aa6515ba49d57f05beae comment)
[...]

That is, the first line of the commit message is in the Subject and
the remaining lines are in the message body.  As far as I can tell,
that's what SubmittingPatches prescribes.  And that is what I see in
the Git mailing list on vger.

(This is with git 1.8.3.3.)

Exactly how should the commit message be inserted into the patch
e-mail?  What needs to be updated so the code is consistent with the
documentation?

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Difficulty adding a symbolic link, part 3

2013-07-31 Thread Dale R. Worley
I've run into a problem (with Git 1.8.3.3) where I cannot add a
symbolic link (as such) to the repository *if* its path is given
absolutely; instead Git adds the file the symbolic link points to.
(If I give the path relatively, Git does what I expect, that is, adds
the symbolic link.)

I've written a test script that shows the problem and included it
below.

I would not expect *how* a path is presented to Git to change how Git
processes the path.  In the test case, I would expect "/bin/awk" and
"../../bin/awk" to produce the same effect when used as arguments to
"git add".

What is going on in the code is this:  In "git add", all paths are
normalized by the function prefix_path_gently() in abspath.c.  That
function removes symbolic links from the pathspec *only if* it is
absolute, as shown in the first few lines of the function:

 static char *prefix_path_gently(const char *prefix, int len, const char *path)
 {
 const char *orig = path;
 char *sanitized;
 if (is_absolute_path(orig)) {
-const char *temp = real_path(path);
+const char *temp = absolute_path(path);
 sanitized = xmalloc(len + strlen(temp) + 1);
 strcpy(sanitized, temp);
 } else {

real_path() is specified to remove symbolic links.  As shown, I've
replaced real_path() with absolute_path(), based on the comment at the
top of real_path():

/*
 * Return the real path (i.e., absolute path, with symlinks resolved
 * and extra slashes removed) equivalent to the specified path.  (If
 * you want an absolute path but don't mind links, use
 * absolute_path().)  The return value is a pointer to a static
 * buffer.
 *

If I replace real_path() with absolute_path() as shown, the problem I
am testing for disappears.

With the above change, the test suite runs with zero failures, so it
doesn't affect any common Git usage.

But I don't know enough about the internal architecture of Git to know
that my change is correct in all cases.  I'm almost certain that the
normalization process for pathspecs should *not* normalize a final
component that is a symbolic link.  But I would expect it would be
desirable to normalize non-final components that are symbolic links.
On the other hand, that might not matter.

Can someone give me advice on what this code *should* do?

I believe I can prepare a proper test for the test suite for this, so
once I know what the code change should be, I can prepare a proper
patch for it.

Dale
--  
Here's a test case for adding a symbolic link.  This test exploits the
fact that on my system, /bin/awk is a symbolic link to "gawk".  As you
can see, the behavior of Git differs if the link's path is given to
"git add" as an absolute path or a relative path.

Here is the test script:
--
#! /bin/bash

# Illustrates a problem with applying "git add" to a symbolic link.

set -x

# To be run from a directory one step below the root directory.  E.g.,
# "/tmp".
# On this system, /bin/awk is a symbolic link to "gawk", which
# means /tmp/gawk.

# Show the Git version.
git version

# Make a test directory and cd to it.
DIR=temp.$$
mkdir $DIR
cd $DIR

# Create a Git repository.
git init
# Set the worktree to be /
git config core.worktree /
# Create an empty commit.
git commit --allow-empty -m Empty.

# Add the symbolic link using its absolute name.
ABSOLUTE=/bin/awk
ls -l $ABSOLUTE
git add $ABSOLUTE
# Notice that the target of the link is added, not the link itself.
git status -uno

# Reset the index.
git reset

# Add the symbolic link using its relative name.
# Remember that we are two directory levels below the root directory now.
RELATIVE=../..$ABSOLUTE
ls -l $RELATIVE
git add $RELATIVE
# Notice that now the link itself is added.
git status -uno
--
Here is sample output of the script:
--
+ git version
git version 1.8.3.3.756.g07a2553.dirty
+ DIR=temp.22366
+ mkdir temp.22366
+ cd temp.22366
+ git init
Initialized empty Git repository in /git-add-link/temp.22366/.git/
+ git config core.worktree /
+ git commit --allow-empty -m Empty.
[master (root-commit) fb232e5] Empty.
+ ABSOLUTE=/bin/awk
+ ls -l /bin/awk
lrwxrwxrwx. 1 root root 4 Nov  2  2012 /bin/awk -> gawk
+ git add /bin/awk
+ git status -uno
# On branch master
# Changes to be committed:
#   (use "git reset HEAD ..." to unstage)
#
#   new file:   ../../bin/gawk
#
# Untracked files not listed (use -u option to show untracked files)
+ git reset
+ RELATIVE=../../bin/awk
+ ls -l ../../bin/awk
lrwxrwxrwx. 1 root root 4 Nov  2  2012 ../../bin/awk -> gawk
+ git add ../../bin/awk
+ git status -uno
# On branch master
# Changes to be committed:
#   (use "git reset HEAD ..." to unstage)
#
#   new file:   ../../bin/awk
#
# Untracked 

Re: [PATCH 1/2] git_mkstemps: correctly test return value of open()

2013-07-18 Thread Dale R. Worley
> From: Junio C Hamano 

> > +test_expect_success 'git_mkstemps_mode does not fail if fd 0 is not open' '
> > +   git init &&
> > +   echo Test. >test-file &&
> > +   git add test-file &&
> > +   git commit -m Message. <&-
> > +'
> > +
> 
> Yup.  I wonder how it would fail without the fix, though ;-)

Eh, what?  You could run it and see.  The test script system will just
say "not ok" for this test.  If you execute those commands from a
shell, you see:

 $ git init
Initialized empty Git repository in /common/not-replicated/worley/temp/1/.git/
 $ echo Test. >test-file
 $ git add test-file
 $ git commit -m Message. <&-
error: unable to create temporary sha1 filename : No such file or directory

error: Error building trees
 $ 

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] git_mkstemps: correctly test return value of open()

2013-07-18 Thread Dale R. Worley
> From: Junio C Hamano 
> 
> That's just a plain-vanilla part of POSIX shell behaviour, no?
> 
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07_05

"Close standard input" is so weird I never thought it was Posix.  In
that case, we can eliminate the C helper program:

diff --git a/t/t0070-fundamental.sh b/t/t0070-fundamental.sh
index 986b2a8..d427f3a 100755
--- a/t/t0070-fundamental.sh
+++ b/t/t0070-fundamental.sh
@@ -25,6 +25,13 @@ test_expect_success POSIXPERM,SANITY 'mktemp to unwritable 
directory prints file
grep "cannotwrite/test" err
 '
 
+test_expect_success 'git_mkstemps_mode does not fail if fd 0 is not open' '
+   git init &&
+   echo Test. >test-file &&
+   git add test-file &&
+   git commit -m Message. <&-
+'
+
 test_expect_success 'check for a bug in the regex routines' '
# if this test fails, re-build git with NO_REGEX=1
test-regex
diff --git a/wrapper.c b/wrapper.c
index dd7ecbb..6a015de 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -322,7 +322,7 @@ int git_mkstemps_mode(char *pattern, int suffix_len, int 
mode)
template[5] = letters[v % num_letters]; v /= num_letters;
 
fd = open(pattern, O_CREAT | O_EXCL | O_RDWR, mode);
-   if (fd > 0)
+   if (fd >= 0)
return fd;
/*
 * Fatal error (EPERM, ENOSPC etc).

Is this a sensible place to put this test?

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] git_mkstemps: correctly test return value of open()

2013-07-18 Thread Dale R. Worley
I've been looking into writing a proper test for this patch.  My first
attempt tests the symptom that was seen initially, that "git commit"
fails if fd 0 is closed.

One problem is how to arrange for fd 0 to be closed.  I could use the
bash redirection "<&-", but I think you want to be more portable than
that.  This version uses execvp() inside a small C program, and
execvp() is a Posix function.

I've tested that this test does what it should:  If you remove the
fix, "fd >= 0", the test fails.  If you then remove "test-close-fd-0"
from before "git init" in the test, the test is nullified and succeeds
again.

Here is the diff.  What do people think of it?

diff --git a/Makefile b/Makefile
index 0600eb4..6b410f5 100644
--- a/Makefile
+++ b/Makefile
@@ -557,6 +557,7 @@ X =
 PROGRAMS += $(patsubst %.o,git-%$X,$(PROGRAM_OBJS))
 
 TEST_PROGRAMS_NEED_X += test-chmtime
+TEST_PROGRAMS_NEED_X += test-close-fd-0
 TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
diff --git a/t/t0070-fundamental.sh b/t/t0070-fundamental.sh
index 986b2a8..6a31103 100755
--- a/t/t0070-fundamental.sh
+++ b/t/t0070-fundamental.sh
@@ -25,6 +25,13 @@ test_expect_success POSIXPERM,SANITY 'mktemp to unwritable 
directory prints file
grep "cannotwrite/test" err
 '
 
+test_expect_success 'git_mkstemps_mode does not fail if fd 0 is not open' '
+   git init &&
+   echo Test. >test-file &&
+   git add test-file &&
+   test-close-fd-0 git commit -m Message.
+'
+
 test_expect_success 'check for a bug in the regex routines' '
# if this test fails, re-build git with NO_REGEX=1
test-regex
diff --git a/test-close-fd-0.c b/test-close-fd-0.c
new file mode 100644
index 000..3745c34
--- /dev/null
+++ b/test-close-fd-0.c
@@ -0,0 +1,14 @@
+#include 
+
+/* Close file descriptor 0 (which is standard-input), then execute the
+ * remainder of the command line as a command. */
+
+int main(int argc, char **argv)
+{
+   /* Close fd 0. */
+   close(0);
+   /* Execute the requested command. */
+   execvp(argv[1], &argv[1]);
+   /* If execve() failed, return an error. */
+   return 1;
+}
diff --git a/wrapper.c b/wrapper.c
index dd7ecbb..6a015de 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -322,7 +322,7 @@ int git_mkstemps_mode(char *pattern, int suffix_len, int 
mode)
template[5] = letters[v % num_letters]; v /= num_letters;
 
fd = open(pattern, O_CREAT | O_EXCL | O_RDWR, mode);
-   if (fd > 0)
+   if (fd >= 0)
return fd;
/*
 * Fatal error (EPERM, ENOSPC etc).

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Bug: Failure if stdin is closed.

2013-07-11 Thread Dale R. Worley
(The original problem and the discussion that ensued is on the
git-users mailing list:
https://groups.google.com/forum/#!topic/git-users/lNQ7Cn35EqA)

"git commit" (and probably other operations) fail if standard input
(fd 0) is closed when git starts.  A simple test case follows.  (The
execution is version 1.7.7.6, but the code listed below is from a very
recent commit.)


$ git --version
git version 1.7.7.6
$ git status
# On branch master
nothing to commit (working directory clean)
$ echo This is a test >
$ git status
# On branch master
# Untracked files:
#   (use "git add ..." to include in what will be committed)
#
#   
nothing added to commit but untracked files present (use "git add" to track)
$ git add 
$ # The notation "0<&-" means "close standard input (fd 0) in the process that
$ # executes this command.  It may be specific to the bash shell.
$ git commit -m  0<&-
error: unable to create temporary sha1 filename : No such file or directory

error: Error building trees
$ git status
# On branch master
# Changes to be committed:
#   (use "git reset HEAD ..." to unstage)
#
#   new file:   
#
$ git commit -m 
[master 54c146c] 
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 
$ git status
# On branch master
# Your branch is ahead of 'origin/master' by 1 commit.
#
nothing to commit (working directory clean)
$ 


The fundamental error is that git_mkstemps_mode() in wrapper.c assumes
that if open() is successful, its return value must be >0.  That is
not true, because if fd 0 is closed, then open() can successfully
return 0.  I have not tested it, but it looks like the fix is:

 int git_mkstemps_mode(char *pattern, int suffix_len, int mode)
 {
 static const char letters[] =
 "abcdefghijklmnopqrstuvwxyz"
 "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
 "0123456789";
 static const int num_letters = 62;
 uint64_t value;
 struct timeval tv;
 char *template;
 size_t len;
 int fd, count;

 len = strlen(pattern);

 if (len < 6 + suffix_len) {
 errno = EINVAL;
 return -1;
 }

 if (strncmp(&pattern[len - 6 - suffix_len], "XX", 6)) {
 errno = EINVAL;
 return -1;
 }

 /*
  * Replace pattern's XX characters with randomness.
  * Try TMP_MAX different filenames.
  */
 gettimeofday(&tv, NULL);
 value = ((size_t)(tv.tv_usec << 16)) ^ tv.tv_sec ^ getpid();
 template = &pattern[len - 6 - suffix_len];
 for (count = 0; count < TMP_MAX; ++count) {
 uint64_t v = value;
 /* Fill in the random bits. */
 template[0] = letters[v % num_letters]; v /= num_letters;
 template[1] = letters[v % num_letters]; v /= num_letters;
 template[2] = letters[v % num_letters]; v /= num_letters;
 template[3] = letters[v % num_letters]; v /= num_letters;
 template[4] = letters[v % num_letters]; v /= num_letters;
 template[5] = letters[v % num_letters]; v /= num_letters;

 fd = open(pattern, O_CREAT | O_EXCL | O_RDWR, mode);
-if (fd > 0)
+if (fd >= 0)
 return fd;
 /*
  * Fatal error (EPERM, ENOSPC etc).
  * It doesn't make sense to loop.
  */
 if (errno != EEXIST)
 break;
 /*
  * This is a random value.  It is only necessary that
  * the next TMP_MAX values generated by adding  to
  * VALUE are different with (module 2^32).
  */
 value += ;
 }
 /* We return the null string if we can't find a unique file name.  */
 pattern[0] = '\0';
 return -1;
 }


However, when looking at the code, I noticed that few of the functions
have comments describing what they do, and none describe their input
and output values.  In particular, there are no comments specifying
what the error return values are.  This is appalling for a supposedly
professional-quality project!

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Difficulty adding a symbolic link, part 2

2013-06-24 Thread Dale R. Worley
Here's a slightly simpler test case for adding a symbolic link.  This
test exploits the fact that on my system, /bin/awk is a symbolic link
to "gawk".  As you can see, the behavior of Git differs if the link's
path is given to "git add" as an absolute path or a relative path.

Here is the test script:
--
#! /bin/bash

# Illustrates a problem with applying "git add" to a symbolic link.

set -x

# To be run from a directory one step below the root directory.  E.g.,
# "/git-add-link".
# Exploits the fact that /bin/awk is a symbolic link to "gawk".

# Show the Git version.
git version

# Make a test directory and cd to it.
DIR=temp.$$
mkdir $DIR
cd $DIR

# Create a Git repository.
git init
# Set the worktree to be /
git config core.worktree /
# Create an empty commit.
git commit --allow-empty -m Empty.

# Add the symbolic link using its absolute name.
ABSOLUTE=/bin/awk
ls -l $ABSOLUTE
git add $ABSOLUTE
# Notice that the target of the link is added, not the link itself.
git status -uno

# Reset the index.
git reset

# Add the symbolic link using its relative name.
# Remember that we are two directory levels below the root directory now.
RELATIVE=../..$ABSOLUTE
ls -l $RELATIVE
git add $RELATIVE
# Notice that now the link itself is added.
git status -uno
--
Here is the output of the script:
--
+ git version
git version 1.7.7.6
+ DIR=temp.22366
+ mkdir temp.22366
+ cd temp.22366
+ git init
Initialized empty Git repository in /git-add-link/temp.22366/.git/
+ git config core.worktree /
+ git commit --allow-empty -m Empty.
[master (root-commit) fb232e5] Empty.
+ ABSOLUTE=/bin/awk
+ ls -l /bin/awk
lrwxrwxrwx. 1 root root 4 Nov  2  2012 /bin/awk -> gawk
+ git add /bin/awk
+ git status -uno
# On branch master
# Changes to be committed:
#   (use "git reset HEAD ..." to unstage)
#
#   new file:   ../../bin/gawk
#
# Untracked files not listed (use -u option to show untracked files)
+ git reset
+ RELATIVE=../../bin/awk
+ ls -l ../../bin/awk
lrwxrwxrwx. 1 root root 4 Nov  2  2012 ../../bin/awk -> gawk
+ git add ../../bin/awk
+ git status -uno
# On branch master
# Changes to be committed:
#   (use "git reset HEAD ..." to unstage)
#
#   new file:   ../../bin/awk
#
# Untracked files not listed (use -u option to show untracked files)
--
I can't see any principle of operation of Git that would cause "git
add /bin/awk" and "git add ../../bin/awk" to be handled differently.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Difficulty adding a symbolic link

2013-06-14 Thread Dale R. Worley
I'm having a problem with "git add" in version 1.7.7.6.

The situation is that I have a repository that is contained in a
second-level directory, a sub-sub-directory of "/".  The core.worktree
of the repository is "/", so the working directory is the entire file
tree.  I want this repository to track selected files, which I add
with "git add".

The difficulty is that "git add" seems to not add specific files,
depending on how they are specified as arguments to "git add".  In
particular, /dev/dvd (which is a symbolic link) can be added with "git
add ../../dev/dvd" but not with "git add /dev/dvd".  On the other
hand, /etc/hosts (an ordinary file) can be added with either "git add
/etc/hosts" or "git add ../../etc/hosts".

To demonstrate the problem, I've written a script.  A typical
execution of the script is as follows.  The lines at the left margin
start with "$", and are the lines of the script.  The lines indented 4
spaces start with "+", and are the "simple commands" as they are
executed by the shell, showing the values substituted for the shell
variables.  The lines indented 8 spaces are the output of the various
commands.

Someone suggested that the problem may be triggered by the fact that
/dev is in a different filesystem than / and /etc.  I added a third
section to the script by creating a symbolic link from /etc/hosts.link
to /etc/hosts, which is thus in the same filesystem as / and the
repository.  Git handles it as expected.

Any help with this would be appreciated.

Dale

$ set -x
$ 
$ # Show git version.
$ git version
+ git version
git version 1.7.7.6
$ 
$ # To be run in a directory that is contained in /.
$ pwd
+ pwd
/git-add-issue
$ 
$ # Make a test directory.
$ DIR=temp.$$
+ DIR=temp.10714
$ mkdir $DIR
+ mkdir temp.10714
$ cd $DIR
+ cd temp.10714
++ pwd
$ DIR_ABSOLUTE=$( pwd )
+ DIR_ABSOLUTE=/git-add-issue/temp.10714
$ 
$ # Create a new repository.
$ git init
+ git init
Initialized empty Git repository in /git-add-issue/temp.10714/.git/
$ # Set the worktree to be /
$ git config core.worktree /
+ git config core.worktree /
$ # Create empty commit.
$ git commit --allow-empty -m Empty.
+ git commit --allow-empty -m Empty.
[master (root-commit) 62d86cb] Empty.
$ 
$ # Show empty status
$ git status -uno
+ git status -uno
# On branch master
nothing to commit (use -u to show untracked files)
$ 
$ # First test:  /dev/dvd, which is a symbolic link.
$ 
$ # Try to add /dev/dvd
$ ABSOLUTE_NAME=/dev/dvd
+ ABSOLUTE_NAME=/dev/dvd
$ ll $ABSOLUTE_NAME
+ ls -al /dev/dvd
lrwxrwxrwx. 1 root root 9 Jun 12 22:23 /dev/dvd -> /dev/dvd1
$ git add $ABSOLUTE_NAME
+ git add /dev/dvd
$ git status -uno
+ git status -uno
# On branch master
nothing to commit (use -u to show untracked files)
$ git reset
+ git reset
$ 
$ # Try with alternative name ../../dev/dvd
$ RELATIVE_NAME=../../dev/dvd
+ RELATIVE_NAME=../../dev/dvd
$ ll $RELATIVE_NAME
+ ls -al ../../dev/dvd
lrwxrwxrwx. 1 root root 9 Jun 12 22:23 ../../dev/dvd -> /dev/dvd1
$ git add $RELATIVE_NAME
+ git add ../../dev/dvd
$ git status -uno
+ git status -uno
# On branch master
# Changes to be committed:
#   (use "git reset HEAD ..." to unstage)
#
#   new file:   ../../dev/dvd
#
# Untracked files not listed (use -u option to show untracked files)
$ git reset
+ git reset
$ 
$ # Second test:  /etc/hosts, which is an ordinary file.
$ 
$ # Try to add /etc/hosts
$ ABSOLUTE_NAME=/etc/hosts
+ ABSOLUTE_NAME=/etc/hosts
$ ll $ABSOLUTE_NAME
+ ls -al /etc/hosts
-rw-r--r--. 1 root root 222 Nov  4  2012 /etc/hosts
$ git add $ABSOLUTE_NAME
+ git add /etc/hosts
$ git status -uno
+ git status -uno
# On branch master
# Changes to be committed:
#   (use "git reset HEAD ..." to unstage)
#
#   new file:   ../../etc/hosts
#
# Untracked files not listed (use -u option to show untracked files)
$ git reset
+ git reset
$ 
$ # Try with alternative name ../../etc/hosts
$ RELATIVE_NAME=../../etc/hosts
+ RELATIVE_NAME=../../etc/hosts
$ ll $RELATIVE_NAME
+ ls -al ../../etc/hosts
-rw-r--r--. 1 root root 222 Nov  4  2012 ../../etc/hosts
$ git add $RELATIVE_NAME
+ git add ../../etc/hosts
$ git status -uno
+ git status -uno
# On branch master
# Changes to be committed:
#   (use "git reset HEAD ..." to unstage)
#
#   new file:   ../../etc/hosts
#
# Untracked files not listed (use -u option to show untracked files)
$ git reset
+ git reset
$ 
$ # Third test:  /etc/hosts.link, which is a link to /etc/hosts
$ 
$ # Try to add /etc/hosts.link
$ ABSOLUTE_NAME=/etc/hosts.link
+ ABSOLUTE_NAME=/etc/hosts.link
$ ll $ABSOLUTE_NAME
+ ls -al /etc/hosts.link
lrwxrwxrwx. 1 root root 5 Jun 14 12:04 /etc/hosts.link -

[PATCHv2] git-submodule.txt: Clarify 'init' and 'add' subcommands.

2013-05-15 Thread Dale R. Worley
Describe how 'add' sets the submodule's logical name, which is used in
the configuration entry names.

Clarify that 'init' only sets up the configuration entries for
submodules that have already been added elsewhere.  Describe that
 arguments limit the submodules that are configured.

Signed-off-by: Dale Worley 
---
This patch seems to have all the features that we have discussed:

- Describes how 'add' selects the submodule logical name, including
  the effect of --name.  (My first patch was on a version of Git that
  did not support --name, so I didn't know of it.)

- Corrects description of 'init' to clarify that its behavior is
  driven by the gitlinks recorded in the index, rather than implying
  it is driven by the contents of .gitmodules.

- Describes 'init' behavior to be driven by the index, rather than my
  previous incorrect use of "file tree".  (Of course, gitlinks aren't
  visible in the file tree.)

- Updated text for 'init' is shorter and less technical than my
  previous patch.

- Since "(which were added and committed elsewhere)" is stated in the
  first sentence, I've removed the later sentence explaining that
  submodules must be added before they can be inited.

- Explains the effect of  arguments to 'init' subcommand.

 Documentation/git-submodule.txt |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index c99d795..83235c0 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -76,6 +76,8 @@ argument  is the relative location for the cloned 
submodule
 to exist in the superproject. If  is not given, the
 "humanish" part of the source repository is used ("repo" for
 "/path/to/repo.git" and "foo" for "host.xz:foo/.git").
+The  is also used as the submodule's logical name in its
+configuration entries unless `--name` is used to specify a logical name.
 +
  is the URL of the new submodule's origin repository.
 This may be either an absolute URL, or (if it begins with ./
@@ -123,8 +125,10 @@ linkgit:git-status[1] and linkgit:git-diff[1] will provide 
that information
 too (and can also report changes to a submodule's work tree).
 
 init::
-   Initialize the submodules, i.e. register each submodule name
-   and url found in .gitmodules into .git/config.
+   Initialize the submodules recorded in the index (which were
+   added and committed elsewhere) by copying submodule
+   names and urls from .gitmodules to .git/config.
+   Optional  arguments limit which submodules will be initialized.
It will also copy the value of `submodule.$name.update` into
.git/config.
The key used in .git/config is `submodule.$name.url`.
-- 
1.7.7.6

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2] CodingGuidelines: make it clear which files in Documentation/ are the sources

2013-05-09 Thread Dale R. Worley
> From: Junio C Hamano 
> 
> > From e87227498ef3d50dc20584c24c53071cce63c555 Mon Sep 17 00:00:00 2001
> > From: Dale Worley 
> > Date: Tue, 7 May 2013 13:39:46 -0400
> > Subject: [PATCH] CodingGuidelines:  make it clear which files in
> >  Documentation/ are the sources
> 
> These five lines are present in the output of the format-patch only
> to help you fill in the MUA's mail header (instead of typing the
> subject, you can cut and paste from here, for example); after you
> are done with the MUA headers, remove them and do not leave them in
> the body of the message.

I was thinking that that was the intention.  But SubmittingPatches
makes it sound like the *entire* output of "git format-patch" is
intended to be the *body* of the e-mail message:

"git format-patch" command follows the best current practice to
format the body of an e-mail message.  At the beginning of the
patch should come your commit message, ending with the
Signed-off-by: lines, and a line that consists of three dashes,
followed by the diffstat information and the patch itself.  If
you are forwarding a patch from somebody else, optionally, at
the beginning of the e-mail message just before the commit
message starts, you can put a "From: " line to name that person.

In particular, while the line "From e87227..." appears to be the
typical Unix mailbox start-of-message line, it wasn't clear to me
whether it was supposed to be sent or not:  If I do the obvious
cut-and-paste of the "From:", "Date:", and "Subject:" lines into the
headers of the e-mail message and copy the following lines into the
body, the "From e87227..." line will have no place to go.  And perhaps
it is an important part of the patch, since "git format-patch" outputs
it?

If you could give me some guidance in regard to the "From e87227..."
line, that would be helpful.  (I suppose I should try to improve that
paragraph of SubmittingPatches to prevent anyone else from having the
same misunderstanding.)

> > Signed-off-by: Dale R. Worley 
> 
> The title looks a bit too long.  For a small and obviously correct
> patch like this, I do not think you would need anything in the log
> message, some of what you wrote below the three-dash line may
> deserve to be said here.  Perhaps:
> 
> Subject: [PATCH] CodingGuidelines: Documentation/*.txt are the sources
> 
> People not familiar with AsciiDoc may not realize they are
> supposed to update *.txt files and not *.html/*.1 files when
> preparing patches to the project.
> 
> But it invites a question.  Why do people patching Git not to know *.txt
> are the sources in the first place?  Generated *.html files are not
> even tracked.

Now that you mention it, I see the mistake I have made.  I had
forgotten that I did a build of Git from that working copy, and so the
*.html and *.1 files might be generated files rather than tracked
files.  And I hadn't yet learned that "git ls-files git-submodule.*"
would do about the same thing that "svn status git-submodule.*" does,
that is, tell the natures of the various files that are present.

> >  Documentation/CodingGuidelines |4 +++-
> >  1 files changed, 3 insertions(+), 1 deletions(-)
> >
> > diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
> > index 7e4d571..b8eef7c 100644
> > --- a/Documentation/CodingGuidelines
> > +++ b/Documentation/CodingGuidelines
> > @@ -238,7 +238,9 @@ For Python scripts:
> >  Writing Documentation:
> >  
> >   Most (if not all) of the documentation pages are written in AsciiDoc
> > - and processed into HTML output and manpages.
> > + and processed into HTML output and manpages.  This means that the *.txt
> > + files in this directory are usually the sources from which the
> > + corresponding *.html, *.1, and *.xml files are generated.
> 
> Whenever you see somebody writing "This means that" or "In other
> words", it is a good habit to ask if the existing text can be
> improved so that it does not need such a follow-up clarification.

That's true.  But it also appears to be a less risky approach to
someone who is editing the text and does not know its history.

> Most (if not all) of the documentation pages are written in the
> AsciiDoc format in *.txt files (e.g. Documentation/git.txt), and
> processed into HTML and manpages (e.g. git.html and git.1 in the
> same directory).
> 
> >   Every user-visible change should be reflected in the documentation.
> >   The same general rule as for code applies -- imitate the existing

Thank you for your help.  I will revise my patch (and learning
exercise) and resubmit it.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2] CodingGuidelines: make it clear which files in Documentation/ are the sources

2013-05-08 Thread Dale R. Worley
>From e87227498ef3d50dc20584c24c53071cce63c555 Mon Sep 17 00:00:00 2001
From: Dale Worley 
Date: Tue, 7 May 2013 13:39:46 -0400
Subject: [PATCH] CodingGuidelines:  make it clear which files in
 Documentation/ are the sources

Signed-off-by: Dale R. Worley 
---
While learning about making a documentation patch, I noticed that
Documentation/CodingGuideles isn't as clear as it could be regarding
how to edit the documentation.  In particular, it says "Most (if not
all) of the documentation pages are written in AsciiDoc - and
processed into HTML output and manpages." without really specifying
the details for those of us who aren't familiar with AsciiDoc.  So I
added a sentence stating explicitly which files are the sources and
which are derived.

It's also a test for submitting a patch.  I've read SubmittingPatches
again, more carefully, and have corrected some problem with my
previous message.

 Documentation/CodingGuidelines |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 7e4d571..b8eef7c 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -238,7 +238,9 @@ For Python scripts:
 Writing Documentation:
 
  Most (if not all) of the documentation pages are written in AsciiDoc
- and processed into HTML output and manpages.
+ and processed into HTML output and manpages.  This means that the *.txt
+ files in this directory are usually the sources from which the
+ corresponding *.html, *.1, and *.xml files are generated.
 
  Every user-visible change should be reflected in the documentation.
  The same general rule as for code applies -- imitate the existing
-- 
1.7.7.6

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] CodingGuidelines: make it clear which files in Documentation are the sources

2013-05-07 Thread Dale R. Worley
While learning about making a documentation patch, I noticed that
Documentation/CodingGuideles isn't as clear as it could be regarding
how to edit the documentation.  In particular, it says "Most (if not
all) of the documentation pages are written in AsciiDoc - and
processed into HTML output and manpages." without really specifying
the details for those of us who aren't familiar with AsciiDoc.  So I
added a sentence stating explicitly which files are the sources and
which are derived.

It's also a test for submitting a patch.

Dale



>From e87227498ef3d50dc20584c24c53071cce63c555 Mon Sep 17 00:00:00 2001
From: Dale Worley 
Date: Tue, 7 May 2013 13:39:46 -0400
Subject: [PATCH] CodingGuidelines:  make it clear which files in
 Documentation are the sources

---
 Documentation/CodingGuidelines |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 7e4d571..b8eef7c 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -238,7 +238,9 @@ For Python scripts:
 Writing Documentation:
 
  Most (if not all) of the documentation pages are written in AsciiDoc
- and processed into HTML output and manpages.
+ and processed into HTML output and manpages.  This means that the *.txt
+ files in this directory are usually the sources from which the
+ corresponding *.html, *.1, and *.xml files are generated.
 
  Every user-visible change should be reflected in the documentation.
  The same general rule as for code applies -- imitate the existing
-- 
1.7.7.6

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Suggestion for improving the manual page for "git submodule"

2013-05-02 Thread Dale R. Worley
Several people have made similar mistakes in beliving that "git
submodule init" can be used for adding submodules to a working
directory, whereas "git submodule add" is the command that should be
used.  That *is* documented at the top of the manual page for "git
submodule", but my error was enhanced by a subtle error in the
documentation of "init".

The text as it is written suggests that init's behavior is driven by
the contents of .submodules.  But in fact, its behavior is driven by
the existing gitlinks in the file tree, possibly limited by the 
arguments.  (Which is *why* "git submodule init" can't *add*
submodules; it only processes *existing* submodules.)

I would like to suggest that the manual page be updated to remove the
error in the description of the init subcommand, along with another
addition to tell the submodule logical name that is used by the "add"
subcommand:

--- man12013-04-26 12:02:16.752702146 -0400
+++ man32013-05-02 21:06:00.020353100 -0400
@@ -61,6 +61,8 @@
to exist in the superproject. If  is not given, the
"humanish" part of the source repository is used ("repo" for
"/path/to/repo.git" and "foo" for "host.xz:foo/.git").
+   The  is used as the submodule's logical name in its
+   configuration entries.
 
 is the URL of the new submodule’s origin repository.
This may be either an absolute URL, or (if it begins with ./ or
@@ -109,7 +111,9 @@
too (and can also report changes to a submodule’s work tree).
 
init
-   Initialize the submodules, i.e. register each submodule name and
+   Initialize the submodules, i.e. register each submodule for which
+   there is a gitlink recorded (or the specific gitlinks specified by
+...) by copying the name and
url found in .gitmodules into .git/config. The key used in
.git/config is submodule.$name.url. This command does not alter
existing information in .git/config. You can then customize the
@@ -118,6 +122,10 @@
submodule update --init without the explicit init step if you do
not intend to customize any submodule locations.
 
+   (Because init only operates on existing gitlinks, it cannot
+   be used to create submodules, regardless of the contents of
+   .gitmodules.  Use the add subcommand to create submodules.)
+
update
Update the registered submodules, i.e. clone missing submodules and
checkout the commit specified in the index of the containing

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What I want rebase to do

2013-03-07 Thread Dale R. Worley
> From: Thomas Rast 
> 
> wor...@alum.mit.edu (Dale R. Worley) writes:
> [...snip...]
> 
> Isn't that just a very long-winded way of restating what Junio said
> earlier:
> 
> > > It was suggested to make it apply the first-parent diff and record
> > > the result, I think.  If that were an acceptable approach (I didn't
> > > think about it through myself, though), that would automatically
> > > cover the evil-merge case as well.

Well, I believe what I said was a fleshed-out way of saying what I
*think* Junio said, but...

> You can fake that with something like
> 
> git rev-list --first-parent --reverse RANGE_TO_REBASE |
> while read rev; do
> if git rev-parse $rev^2 >/dev/null 2>&1; then
> git cherry-pick -n -m1 $rev
> git rev-parse $rev^2 >.git/MERGE_HEAD
> git commit -C$rev
> else
> git cherry-pick $rev
> fi
> done

This code doesn't do that.  I don't want something that rebases a
single thread of the current branch, I want something that rebases
*all* the commits between the head commit and the merge base.  Which
is what is illustrated in my message.

> [1]  If you don't get the sarcasm: that would amount to reinventing
> large parts of git-rebase.

Yes, that is the point of the exercise.

I've done a proof-of-concept implementation of what I want to see,
calling it git-rebase--merge-safe.  But I'm new here and likely that
is a pretty crude solution.  I suspect that a real implementation
could be done by inserting this logic into the framework of
git-filter-tree.  Following is git-rebase--merge-safe, and the script
I use to test it (and explore rebase problems).

Dale
--
git-rebase--merge-safe

#!/bin/bash

. git-sh-setup

prec=4

set -ex

# Ensure the work tree is clean.
require_clean_work_tree "rebase" "Please commit or stash them."

onto_name=$1
onto=$(git rev-parse --verify "${onto_name}^0") ||
die "Does not point to a valid commit: $1"

head_name=$( git symbolic-ref HEAD )
orig_head=$(git rev-parse --verify $head_name) ||
exit 1

echo onto=$onto
echo head_name=$head_name
echo orig_head=$orig_head

# Get the merge base, which is the root of the branch that we are rebasing.
# (For now, ignore the question of whether there is more than one merge base.)
mb=$(git merge-base "$onto" "$orig_head")
echo mb=$mb

# Get the list of commits to rebase, which is everything between $mb and
# $orig_head.
# Note that $mb is not included.
revisions=`git rev-list --reverse --ancestry-path $mb..$orig_head`
echo revisions=$revisions

# Set up the list mapping the commits on the original branch to the commits
# on the branch we are creating.
# Its format is ",old-hash1/new-hash1,old-hash2/new-hash2,...,".
# The initial value maps $mb to $onto.
map=",$mb/$onto,"

# Export these so git commit can see them.
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE

# Process each commit in forward topological order.
for cmt in $revisions
do
# Examine the commit to extract information we will need to reconstruct it.
# First parent of the commit that has a mapping, i.e., is part of the
# branch (and has thus been rebuilt already.
first_mapped_parent=
# The new commit that was made of $first_mapped_parent.
first_mapped_parent_mapped=
# List of -p options naming the parent commits, or their new commits if they
# are in the branch.
parents=
   # Dissect the old commit's data.
# Output the commit data into FD 3.
exec 3< <( git cat-file commit $cmt )

while read keyword rest <&3
do
case $keyword in
tree)
# Ignored
;;
parent)
# See if the parent is mapped, i.e., is in the
# original branch.
if [[ "$map" == *,$rest/* ]]
then
# This parent has been mapped.  Get the new commit.
parent_mapped=${map#*,$rest/}
parent_mapped=${parent_mapped%%,*}
if test -z "$first_mapped_parent"
then
first_mapped_parent=$rest
first_mapped_parent_mapped=$parent_mapped
fi
else
# This parent has not been mapped.
parent_mapped=$rest
fi
# $parent_mapped is a parent of the new commit.
parents="$parents -p $parent_mapped"
;;
author)
# Extract the information about the author.
GIT_AUTHOR_NAME="${rest%% <*}"
 

What I want rebase to do

2013-03-06 Thread Dale R. Worley
This is how I see what rebase should do:

The simple case for rebase starts from

   P---Q---R---S master
\
 A---B---C topic

Then "git checkout topic ; git rebase master" will change it to

   P---Q---R---S master
\
 A'--B'--C' topic

A' is created by a three-way merge that can be symbolized

Q-->S
|   |
v   v
A-->A'

That is, Q, applying the changes from Q to A and the changes from Q to
S, becomes A'.

After that

A-->A'
|   |
v   v
B-->B'
|   |
v   v
C-->C'

A more complex case is when there is a merge from an external source

   P---Q---R---S master
\
 A---M---C topic
/
---X

We want to produce

   P---Q---R---S master
\
 A'--M'--C' topic
/
---X

So we have to merge

Q-->S
|   |
v   v
A-->A'
|   |
v   v
M-->M'
|   |
v   v
C-->C'

Any "evil" changes in M will be in the changes A->M (along with the
changes introduced from X), and so they will be reincorporated in
A'->M'.  M' lists A' and X as its parents.  (And not M!)

If there is an internal merge in the topic branch, things look like
this

   P---Q---R---S master
\
 \   B
  \ / \
   A   M---D topic
\ /
 C

and we want to produce this

   P---Q---R---S master
\
 \B'
  \  /  \
   A'M'--D' topic
 \  /
  C'

Which can be done with these merges


Q-->S
|   |
v   v
A-->A'A-->A'
|   | |   | 
v   v v   v 
B-->B'C-->C'

There are two choices for constructing M' (which ought to produce the
same results under ordinary circumstances)

B-->B'C-->C'
|   | |   | 
v   v v   v 
M-->M'M-->M'

and finally

M-->M'
|   |
v   v
D-->D'

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


"git rebase" loses the additional changes in "evil" merges

2013-03-04 Thread Dale R. Worley
(git version 1.7.7.6)

I've been learning how to use Git.  While exploring "git rebase", I've
discovered that if the branch being rebased contains an "evil" merge,
that is, a merge which contains changes that are in addition to the
changes in any of the parent commits, the rebase operation will
silenty lose those additional changes.

I believe that this is a serious bug.

The problem is not that Git does not propagate the additional changes
(although I'd like it to do so), the problem is that the rebase
*silently loses* changes in the code.  It would be acceptable if Git
detected the evil merge, notified me of it, and forced me to manually
re-do it in order to finish the rebase.

Throughout the Git design (as in other SCMs), care has been taken to
not lose changes that the user has introduced.  In this situation,
that principle is violated.

I have a script that demonstrates this failure, if that would be
useful for testing purposes.

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bug in "git log --graph -p -m" (version 1.7.7.6)

2013-02-06 Thread Dale R. Worley
> From: Matthieu Moy 
> 
> In any case, I can't reproduce with 1.8.1.2.526.gf51a757: I don't get
> undless output. On the other hand, I get a slightly misformatted output:
> 
> *   commit a393ed598e9fb11436f85bd58f1a38c82f2cadb7 (from 
> 2c1e6a36f4b712e914fac994463da7d0fdb2bc6d)
> |\  Merge: 2c1e6a3 33e70e7
> | | Author: Matthieu Moy 
> | | Date:   Tue Feb 5 22:05:33 2013 +0100
> | | 
> | | Commit S
> | | 
> | | diff --git a/file b/file
> | | index 6bb4d3e..afd2e75 100644
> | | --- a/file
> | | +++ b/file
> | | @@ -1,4 +1,5 @@
> | |  1
> | |  1a
> | |  2
> | | +2a
> | |  3
> | | 
> commit a393ed598e9fb11436f85bd58f1a38c82f2cadb7 (from 
> 33e70e70c0173d634826b998bdc304f93c0966b8)
> | | Merge: 2c1e6a3 33e70e7
> | | Author: Matthieu Moy 
> | | Date:   Tue Feb 5 22:05:33 2013 +0100
> 
> The second "commit" line (diff with second parent) doesn't have the
> "| |" prefix, I don't think this is intentional.

The second "commit" line should start with "| * ":

> | | 
> | * commit a393ed598e9fb11436f85bd58f1a38c82f2cadb7 (from 
> 33e70e70c0173d634826b998bdc304f93c0966b8)
> | | Merge: 2c1e6a3 33e70e7
> | | Author: Matthieu Moy 
> | | Date:   Tue Feb 5 22:05:33 2013 +0100

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Bug in "git log --graph -p -m" (version 1.7.7.6)

2013-02-05 Thread Dale R. Worley
I have found a situation where "git log" produces (apparently)
endless output.  Presumably this is a bug.  Following is a (Linux)
script that reliably reproduces the error for me (on Fedora 16):

--
set -ve

# Print the git version.
git --version

# Create respository.
rm -rf .git
git init

# Initial commit.
( echo 1 ; echo 2 ; echo 3 ) >file
git add file
git commit -m 'Commit P'
git branch B HEAD

# Next commit on master adds line "1a".
( echo 1 ; echo 1a ; echo 2 ; echo 3 ) >file
git add file
git commit -m 'Commit Q'

git checkout B

# Next commit on B adds line "2a".
( echo 1 ; echo 2 ; echo 2a ; echo 3 ) >file
git add file
git commit -m 'Commit R'

# Merge the two commits, but add line "3a" to the commit as well.
git checkout master
git merge --no-commit B
# Show what the merge produces.
cat file
# Add line "3a".
( echo 1 ; echo 1a ; echo 2 ; echo 2a ; echo 3 ; echo 3a ) >file
git commit -m 'Commit S'

# These log commands work.
git log
git log --graph
git log --graph -p

# This log command produces infinite output.
git log --graph -p -m
--

Dale
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html