Re: [PATCH v5 0/3] interactive git clean

2013-05-06 Thread Matthieu Moy
Eric Sunshine sunsh...@sunshineco.com writes:

 The pattern [y] will match a file named 'y'. It probably is unusual for
 files named 'y', 'n', etc. to exist in the top-level directory, but
 the gitignore patterns already provide an escape hatch for these
 unusual cases.

But how does the user know that?

I'd rather stay away from DWIM that works in 99% of cases but does
something dangerous in the 1% remaining, plus a complex, un-guessable
escape scheme to handle those few cases. The two stages (yes/no/edit,
and then escape patterns) are clear, and do not require that many
additional keystrokes.
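For reference, the ambiguity being discussed can be reproduced with Python's fnmatch, whose glob rules are close to gitignore's (an illustration only; git's actual matcher differs in details such as '/' handling):

```python
from fnmatch import fnmatch

# "[y]" is a character class, so it matches a file literally named "y" --
# exactly the collision with a 'y'/'n' prompt answer discussed above.
assert fnmatch("y", "[y]")
assert not fnmatch("n", "[y]")

# The escape hatch: bracketing the "[" makes the pattern match only a
# file literally named "[y]".
assert fnmatch("[y]", "[[]y]")
```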

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re[4]: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()

2013-05-06 Thread Ilya Basin
 The last error I encountered is:
 r7009 = 39805bb078983e34f2fc8d2c8c02d695d00d11c0 (refs/remotes/DMC4_Basic)
 Too many open files: Can't open file
 '/home/il/builds/sicap/gitsvn/prd_dmc4.svn/db/revs/0/786': Too many open files
 at /.snapshots/persist/builds/git/git-git/perl/blib/lib/Git/SVN/Ra.pm line 282.
 
 I think it's unrelated to empty dirs.

EW Can you get an lsof on the git-svn process right before this?
IB /.snapshots/persist/builds/sicap/gitsvn/aaa/.git/A4O_OTQxWc
IB /.snapshots/persist/builds/sicap/gitsvn/aaa/.git/LfpcENJduN
IB /.snapshots/persist/builds/sicap/gitsvn/aaa/.git/Dkk7pN4Mpz
IB etc.

EW What's your open files limit?
IB 1024

IB Why is there no call to close() from temp_release() in Git.pm?

Found, fixed. It was related to empty dirs.
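A minimal Python sketch (not Git.pm itself) of the failure mode discussed above: a temp-file pool whose release path never calls close() leaks one descriptor per handle until the process hits its open-file limit (the "1024" mentioned earlier):

```python
import resource
import tempfile

# The per-process open-file limit; EMFILE ("Too many open files") fires
# once this many descriptors are held at the same time.
soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file soft limit: {soft}")

# Handing out temp files without ever closing them holds one fd each.
handles = [tempfile.TemporaryFile() for _ in range(8)]  # 8 fds held open

# The fix: the release path must actually close() the handle, returning
# the descriptor; otherwise a long import eventually exhausts `soft`.
for h in handles:
    h.close()
```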



-- 



Re[3]: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()

2013-05-06 Thread Ilya Basin
Hi Eric. I'm running out of spare time and I'm still unable to import my repo.
The code of SVN.pm is too complex for me. Please help me.
Here's the list of my issues:

* I think git-svn doesn't handle the case when a tag is deleted.
  I expected it to rename the ref from tags/tagname to
  tags/tagname@rev, but that doesn't happen.
  If a tag is replaced, there's no way to tell what the previous
  state of that tag was: git-svn just rewrites the ref.
  On the other hand, the temporary refs (with the @rev suffix) used for
  re-importing subdir tags are kept after a successful reimport, although
  they serve no purpose.

* As I said already, I have 25k revisions and 200 tags created from
  subdirs in trunk. This increases the import time from 2h to 12h.
  I would bear it if it had to be done only once, but fetching a new
  revision may cause a re-import of all 25k revisions too.
  You should implement some mechanism to find the parent branches of
  subdir tags. Maybe the unused refs I mentioned in the previous issue
  are good candidates for that, but I would name them somewhat
  differently to distinguish them from deleted/replaced tags/branches.

* There are mistaken commits in the svn history, similar to this one:

r21255 | xxx_xx_x | 2012-03-02 18:46:30 +0300 (Fri, 02 Mar 2012) | 1 line
Changed paths:
  A /tags/dmagentenabler-4.1.31/DMAgent (from /tags:20998)

Delivery 4.1.31

  git-svn tries to create a tag containing dirs with other tags.
  Technically, it behaves correctly, but it hangs because of the size of
  the commit.
  To solve it, I had to edit the svn dump file:

 Node-path: tags/dmagentenabler-4.1.31/DMAgent
 Node-kind: dir
 Node-action: add
-Node-copyfrom-rev: 20998
-Node-copyfrom-path: tags
+Prop-content-length: 10
+Content-length: 10
+
+PROPS-END
 
 
 Revision-number: 21256

  It creates an empty dir instead of copying. Since the author
  noticed the mistake and immediately deleted the dir in the next
  revision, this works.
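The dump-file fix above could be automated by scanning the dump for copy records whose source is a whole top-level directory. A hypothetical helper sketching that (function and parameter names are made up; a real tool must also rewrite content-length headers):

```python
# Scan svnadmin-dump text for Node-copyfrom-path records that copy from
# an entire top-level directory such as "tags" -- the kind of mistaken
# commit described above.
def suspicious_copies(dump_text, roots=("tags", "branches")):
    suspects = []
    node_path = None
    for line in dump_text.splitlines():
        if line.startswith("Node-path: "):
            node_path = line[len("Node-path: "):]
        elif line.startswith("Node-copyfrom-path: "):
            src = line[len("Node-copyfrom-path: "):]
            if src in roots:
                suspects.append((node_path, src))
    return suspects

dump = """Node-path: tags/dmagentenabler-4.1.31/DMAgent
Node-kind: dir
Node-action: add
Node-copyfrom-rev: 20998
Node-copyfrom-path: tags
"""
assert suspicious_copies(dump) == [("tags/dmagentenabler-4.1.31/DMAgent", "tags")]
```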




Re: [PATCH v5 0/3] interactive git clean

2013-05-06 Thread Eric Sunshine
Hi Matthieu,

On Mon, May 6, 2013 at 3:58 AM, Matthieu Moy
matthieu@grenoble-inp.fr wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 The pattern [y] will match a file named 'y'. It probably is unusual for
 files named 'y', 'n', etc. to exist in the top-level directory, but
 the gitignore patterns already provide an escape hatch for these
 unusual cases.

 But how does the user know that?

Documentation. Junio also mentioned having a '?' in each prompt
explaining the options at that point.

 I'd rather stay away from DWIM that works in 99% of cases but does
 something dangerous in the 1% remaining, plus a complex, un-guessable
 escape scheme to handle those few cases. The two stages (yes/no/edit,
 and then escape patterns) are clear, and do not require that many
 additional keystrokes.

The scheme doesn't have to be unsafe. git-clean *knows* which files
are up for deletion. If one of those filenames conflicts with one of
the prompt options, --interactive mode can warn about the ambiguity at
the point the user types that particular ambiguous response, and ask
for clarification (did you mean 'yes, delete everything' or 'exclude
filename y'?). You get DWIM for 99.9% of the cases, and an extra
prompt for the other 0.1%. Nothing unsafe. Is such an implementation
more complicated or ugly than the separate 'edit' mode? Is the user
experience better or worse (more or less convenient)? Is it easier or
harder to document and explain?
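The collision check described above can be sketched in a few lines (a toy illustration, not git-clean's actual code; option names are made up):

```python
# Interactive-clean disambiguation sketch: a response that is both a
# prompt option and a file scheduled for deletion is flagged so the
# user can be asked to clarify.
PROMPT_OPTIONS = {"y": "yes", "n": "no", "e": "edit"}

def interpret(response, files_to_delete):
    if response in PROMPT_OPTIONS and response in files_to_delete:
        return "ambiguous"          # ask the user which meaning was intended
    if response in PROMPT_OPTIONS:
        return PROMPT_OPTIONS[response]
    return "exclude"                # treat as a filename/pattern to exclude

assert interpret("y", ["a.o", "b.o"]) == "yes"
assert interpret("y", ["y", "b.o"]) == "ambiguous"
assert interpret("b.o", ["y", "b.o"]) == "exclude"
```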

Another observation which came to mind after my original email:

Would it make sense for --interactive mode to launch an editor with
the list of files scheduled for deletion and allow the user to remove
lines he does not want deleted (similar to git-rebase
--interactive)? Depending upon the case, this could be more or less
convenient than the proposed gitignore-patterned 'edit' mode. It could
be yet another option in the yes/no/prompt/edit menu, or it could be
the default and only behavior of git-clean --interactive (again
similar to git-rebase --interactive).
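The editor-based flow floated above can be sketched as follows (names are hypothetical; in the rebase-i style, lines the user deletes are spared and only surviving entries are removed):

```python
# Present the deletion candidates in an editor buffer; the user deletes
# the lines for files to keep, and the survivors are actually deleted.
def files_to_delete(candidates, edited_text):
    survivors = {line.strip() for line in edited_text.splitlines()
                 if line.strip() and not line.startswith("#")}
    return [f for f in candidates if f in survivors]

candidates = ["a.o", "b.o", "tmp/"]
edited = "# Remove a line to keep that file\na.o\ntmp/\n"  # user deleted "b.o"
assert files_to_delete(candidates, edited) == ["a.o", "tmp/"]
```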

I'm not advocating any particular user interface. The observations and
questions about convenience and user experience both here and in my
original email were made with the hope of promoting discussion before
the interface gets locked in. How clunky or fluid should it be? How
verbose or terse?

-- ES


Re: trouble on windows network share

2013-05-06 Thread Thomas Rast
David Goldfarb d...@degel.com writes:

 Git works correctly under Linux (Ubuntu 12.04; git 1.7.9.5).  I've attached 
 the strace outputs. (Note: for reasons that are probably irrelevant, I needed 
 to run the commands sudo'd. Shout back if this is an issue).

 Under Windows 7, Cygwin git 1.7.9, commit fails:
   U:\foo>git commit -m "added foo2"
   error: unable to find 0b89efdeef245ed6a0a7eacc5c578629a141f856
   fatal: 0b89efdeef245ed6a0a7eacc5c578629a141f856 is not a valid object

 For what it's worth, note that the file does exist.
   U:\foo>ls -l .git/objects/0b
   total 1024
   -rwxrw-r-- 1   74 May  5 01:15 89efdeef245ed6a0a7eacc5c578629a141f856

 (I'm not sure why the permissions are trashed. Seems to be a Cygwin thing, or
 maybe my Cygwin config. The ?? also appears on local files, and I
 believe also with files on the old Buffalo drive, so I don't think it is
 relevant to the problem.)  Just in case, here's the same dir, as seen from
 the Ubuntu VM:

   deg@ubuntu:/mnt/users/foo$ ls -l .git/objects/0b
   total 64
   -rwxr-xr-x 0 root root 74 May  5 01:15 
 89efdeef245ed6a0a7eacc5c578629a141f856

 Again, note that there is some user permissions lossage here. I don't know 
 enough about Linux mount or CIFS, and apparently did the mount in a way that 
 everything seems to appear to be stuck owned by root. (same problem I hinted 
 at above). Hope this is not relevant to the problem.

 Here's how the same directory looks, when I'm ssh'd into the NAS box itself:

CentralPark:/shares/Users/foo# ls -l .git/objects/0b
   total 64
   -rwxrw-r-- 1 deg share 74 May  5 01:15 
 89efdeef245ed6a0a7eacc5c578629a141f856

 In any event, the symptoms don't seem to be a permissions problem, so all 
 this extra info is probably just a red herring, I hope.

Hrm.  What about what Jeff already asked of the OP (and AFAICS never got
a reply)?

} If it's a race condition between the write and the subsequent read in
} the same process, then it would be solved by looking at the object
} later. Does git cat-file -p 6838761d549cf76033d2e9faf5954e62839eb25d
} work, or is the object forever inaccessible?

In your case: git cat-file -p 0b89efdeef245ed6a0a7eacc5c578629a141f856

-- 
Thomas Rast
trast@{inf,student}.ethz.ch


Re: Add porcelain command to revert a single staged file to its HEAD state while preserving its staged state

2013-05-06 Thread Thomas Rast
Dimitar Bonev dsbo...@gmail.com writes:

 One more argument against the suggestion of doing a commit ahead of
 time is that I like to think in separation to reduce complexity - in
 particular I like to think only for the working dir and index states,
 commits - I treat them as finished work. If I have to amend a commit,
 it is about fixing a mistake - adding/removing a wrong file, fixing a
 typo, that sort of things and not for actual work to be done.

Aside from what's been said already, there's a fundamental problem with
your approach: it doesn't scale in any direction.  You cannot use it if
you want to do the same dance with more than one version set aside,
because the index gives you exactly one more storage area.  And it does
not generalize in any useful way, whereas using temporary commits is
useful also if you want to briefly leave a branch with unfinished work
to do something else.  Furthermore, knowing how to work with temporary
commits (in the plural), and clean them up using 'rebase -i', *is* a
part of work to be done in most git workflows.  Namely, it is used to
reshape your branch from a "how did I hack this up" angle into a "how do
I explain this to others" angle.

So I hope I'm not misconstruing what Junio and Jonathan said, but I
think you can sum up our position as: your proposal is a very specific
solution to a very narrow problem -- that is only needed because your
environment (PowerShell) is arguably broken.  All the while, there are
several existing solutions that we consider more natural, and that
generalize to other use-cases.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch


RE: trouble on windows network share

2013-05-06 Thread David Goldfarb
Looks like it works.

From the Windows machine:
U:\foo>git cat-file -p 0b89efdeef245ed6a0a7eacc5c578629a141f856
100644 blob b02e7c87fe376a353ea4f014bdb3f5200a946b37	foo1
100644 blob 2cbf64f759a62392ad9dfe1fb9c2cdb175876014	foo2

U:\foo>


Double-checking that nothing was fixed or changed when I earlier committed the 
file from Linux, here's a second test:


U:\foo>git cat-file -p 0b89efdeef245ed6a0a7eacc5c578629a141f856
100644 blob b02e7c87fe376a353ea4f014bdb3f5200a946b37	foo1
100644 blob 2cbf64f759a62392ad9dfe1fb9c2cdb175876014	foo2

U:\foo>git status
# On branch master
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#	trace1
#	trace2
nothing added to commit but untracked files present (use "git add" to track)

U:\foo>git add trace1

U:\foo>git commit trace1 -m "testing"
error: unable to find cecae5b4c87ea21aef513fcfcd5c27fe87e0536f
fatal: cecae5b4c87ea21aef513fcfcd5c27fe87e0536f is not a valid object

U:\foo>git cat-file -p cecae5b4c87ea21aef513fcfcd5c27fe87e0536f
100644 blob b02e7c87fe376a353ea4f014bdb3f5200a946b37	foo1
100644 blob 2cbf64f759a62392ad9dfe1fb9c2cdb175876014	foo2
100644 blob 19102815663d23f8b75a47e7a01965dcdc96468c	test.txt
100644 blob c9009b02950964cf1d5281125e6e2f647dd9dc16	trace1

U:\foo>


David



Re: jn/config-ignore-inaccessible (Re: What's cooking in git.git (Apr 2013, #10; Mon, 29))

2013-05-06 Thread Thomas Rast
Jonathan Nieder jrnie...@gmail.com writes:

 Thomas Rast wrote:
 Junio C Hamano gits...@pobox.com writes:

 * jn/config-ignore-inaccessible (2013-04-15) 1 commit
  - config: allow inaccessible configuration under $HOME

  When $HOME is misconfigured to point at an unreadable directory, we
  used to complain and die. This loosens the check.

  I do not think we agreed that this is a good idea, though.

 As a data point: yesterday on IRC, two users complained that they each
 had this problem.

   http://colabti.org/irclogger/irclogger_log/git?date=2013-05-03#l3022
   http://colabti.org/irclogger/irclogger_log/git?date=2013-05-03#l3111

 I think the approach taken in the patch above is a good one.  If
 /etc/gitconfig contains important configuration, it is still not
 ignored, errors other than permissions reading ~/.gitconfig are
 still fatal, and permissions errors accessing ~/.gitconfig are no
 longer fatal because they are expected as something very common
 in normal setups.

 I haven't been able to convince myself there is a different, better
 behavior to be found.  Special-casing inaccessible $HOME while still
 forbidding inaccessible $HOME/.config/git and $HOME/.gitconfig would
 seem strange.

What I found iffy about all of it is that the current failures happen
really late: they prevent the children spawned by the daemon for repo
handling from doing any useful work, while the daemon itself chugs along
nicely.

Wouldn't it be better to (attempt to) reload configs immediately after
switching to the new user/group, and then either warn or exit?
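Thomas's suggestion could look roughly like this (a hypothetical helper, not git-daemon code; the point is that the probe happens once, immediately after the privilege switch, instead of failing in every spawned child):

```python
import tempfile

# Probe the config right after switching user/group and surface a
# permission failure up front, once.
def check_config_readable(path):
    try:
        with open(path) as f:
            f.read()
        return "ok"
    except PermissionError:
        return "warn: config unreadable after switching user/group"
    except FileNotFoundError:
        return "ok"  # a missing config is normal; only EACCES is suspicious

# Usage sketch: a readable config and an absent one both pass.
with tempfile.NamedTemporaryFile("w", suffix=".gitconfig") as cfg:
    cfg.write("[core]\n\tbare = false\n")
    cfg.flush()
    assert check_config_readable(cfg.name) == "ok"
```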

-- 
Thomas Rast
trast@{inf,student}.ethz.ch


Re: trouble on windows network share

2013-05-06 Thread Thomas Rast
David Goldfarb d...@degel.com writes:

 Looks like it works.

 From the Windows machine:
 U:\foo>git cat-file -p 0b89efdeef245ed6a0a7eacc5c578629a141f856
 100644 blob b02e7c87fe376a353ea4f014bdb3f5200a946b37	foo1
 100644 blob 2cbf64f759a62392ad9dfe1fb9c2cdb175876014	foo2
 
 U:\foo>


 Double-checking that nothing was fixed or changed when I earlier committed 
 the file from Linux, here's a second test:


 U:\foo>git cat-file -p 0b89efdeef245ed6a0a7eacc5c578629a141f856
 100644 blob b02e7c87fe376a353ea4f014bdb3f5200a946b37	foo1
 100644 blob 2cbf64f759a62392ad9dfe1fb9c2cdb175876014	foo2
 
 U:\foo>git status
 # On branch master
 # Untracked files:
 #   (use "git add <file>..." to include in what will be committed)
 #
 #	trace1
 #	trace2
 nothing added to commit but untracked files present (use "git add" to track)
 
 U:\foo>git add trace1
 
 U:\foo>git commit trace1 -m "testing"

Note that specifying 'trace1' here is redundant, since you already added
it and had no other staged changes.  Perhaps you can re-run a test like
this without the extra argument for comparison.  That would tell us if
it matters that the write and read happen in the same process.

 error: unable to find cecae5b4c87ea21aef513fcfcd5c27fe87e0536f
 fatal: cecae5b4c87ea21aef513fcfcd5c27fe87e0536f is not a valid object
 
 U:\foo>git cat-file -p cecae5b4c87ea21aef513fcfcd5c27fe87e0536f
 100644 blob b02e7c87fe376a353ea4f014bdb3f5200a946b37	foo1
 100644 blob 2cbf64f759a62392ad9dfe1fb9c2cdb175876014	foo2
 100644 blob 19102815663d23f8b75a47e7a01965dcdc96468c	test.txt
 100644 blob c9009b02950964cf1d5281125e6e2f647dd9dc16	trace1

I'm inclined to just say that your FS is crazy.

What's unsatisfactory is that we already have a bunch of crazy FS
workarounds in move_temp_to_file(), which is one obvious candidate for
what is going on here.  So this wouldn't be something new; just another
craziness to work around.

For example, you could test the theory that rename() has something to do
with it by patching this into move_temp_to_file():

diff --git i/sha1_file.c w/sha1_file.c
index 67e815b..22af015 100644
--- i/sha1_file.c
+++ w/sha1_file.c
@@ -2635,6 +2635,10 @@ int move_temp_to_file(const char *tmpfile, const char *filename)
 		/* FIXME!!! Collision check here ? */
 	}
 
+	if (access(filename, R_OK) < 0)
+		return error("access(%s, R_OK) failed immediately after rename(): %s",
+			     filename, strerror(errno));
+
 out:
 	if (adjust_shared_perm(filename))
 		return error("unable to set permission to '%s'", filename);


-- 
Thomas Rast
trast@{inf,student}.ethz.ch


RE: trouble on windows network share

2013-05-06 Thread David Goldfarb
Ok.  Continuing in the same shell:

U:\foo>git status
# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#	new file:   trace1
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#	trace2

U:\foo>git commit -m "test b"
error: Trying to write ref refs/heads/master with nonexistent object 428dafc292b8396fc7b3c7f692dc9dfe1196a40a
fatal: cannot update HEAD ref

U:\foo>dir /s *a40a
 Volume in drive U is Users
 Volume Serial Number is FC97-3DA5

 Directory of U:\foo\.git\objects\42

05/06/2013  12:26 PM               152 8dafc292b8396fc7b3c7f692dc9dfe1196a40a
               1 File(s)            152 bytes

 Total Files Listed:
               1 File(s)            152 bytes
               0 Dir(s)  1,594,477,576,192 bytes free

U:\foo>date
The current date is: Mon 05/06/2013
Enter the new date: (mm-dd-yy)
U:\foo>time
The current time is: 13:10:49.55
Enter the new time:
U:\foo>git cat-file -p 428dafc292b8396fc7b3c7f692dc9dfe1196a40a
tree cecae5b4c87ea21aef513fcfcd5c27fe87e0536f
parent f6a0de35d12a3b58f12bf1af4ff629b8b004ad82
author David Goldfarb <d...@degel.com> 1367834997 +0300
committer David Goldfarb <d...@degel.com> 1367834997 +0300

test b

U:\foo>



David



Re: [PATCH 4/4] fast-import: only store commit objects

2013-05-06 Thread Michael Haggerty
On 05/03/2013 08:23 PM, Felipe Contreras wrote:
 On Fri, May 3, 2013 at 12:56 PM, Thomas Rast tr...@inf.ethz.ch wrote:
 Felipe Contreras felipe.contre...@gmail.com writes:
 
 How do we know that this doesn't break any users of fast-import?  Your
 comment isn't very reassuring:

 "the vast majority of them will never be used again"

 So what's with the minority?
 
 Actually I don't think there's any minority. If the client program
 doesn't store blobs, the blob marks are not used anyway. So there's no
 change.

I haven't been following this conversation in detail, but your proposed
change sounds like something that would break cvs2git [1].  Let me
explain what cvs2git does and why:

CVS stores all of the revisions of a single file in a single filename,v
file in rcsfile(5) format.  The revisions are stored as deltas ordered
so that a single revision can be reconstructed from a single serial read
of the file.

cvs2git reads each of these files once, reconstructing *all* of the
revisions for a file in a single go.  It then pours them into a
git-fast-import stream as blobs and sets a mark on each blob.

Only much later in the conversion does it have enough information to
reconstruct tree-wide commits.  At that time it outputs git-fast-import
data (to a second file) defining the git commits and their ancestry.
The contents are defined by referring to the marks of blobs from the
first git-fast-import stream file.

This strategy speeds up the conversion *enormously*.
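The two-phase strategy can be sketched as follows (helper names are hypothetical and the stream is simplified; the point is that phase one emits every file revision as a marked blob, and phase two, possibly a separate fast-import run, refers back to those marks):

```python
# Phase one: emit file revisions as blobs, each with a mark.
def blob_stream(blobs):
    lines = []
    for mark, data in blobs:
        lines += ["blob", f"mark :{mark}", f"data {len(data.encode())}", data]
    return "\n".join(lines) + "\n"

# Phase two: build a commit whose tree entries reference the blob marks
# set in phase one (e.g. via --import-marks in a second fast-import).
def commit_stream(path_marks, msg="import from CVS\n"):
    lines = [
        "commit refs/heads/master",
        "committer C V S <cvs@example.com> 0 +0000",
        f"data {len(msg.encode())}",
        msg.rstrip("\n"),
    ]
    lines += [f"M 100644 :{mark} {path}" for path, mark in path_marks]
    return "\n".join(lines) + "\n"

phase1 = blob_stream([(1, "hello\n")])
phase2 = commit_stream([("greeting.txt", 1)])
assert "mark :1" in phase1
assert "M 100644 :1 greeting.txt" in phase2
```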

So if I understand correctly that you are proposing to stop allowing
marks on blob objects to be set and/or referred to later, then I object
vociferously.

If I've misunderstood then I'll go back into my hole :-)

Michael

[1] http://cvs2svn.tigris.org/cvs2git.html

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/


Re: [PATCH 4/4] fast-import: only store commit objects

2013-05-06 Thread Thomas Rast
Michael Haggerty mhag...@alum.mit.edu writes:

 On 05/03/2013 08:23 PM, Felipe Contreras wrote:
 On Fri, May 3, 2013 at 12:56 PM, Thomas Rast tr...@inf.ethz.ch wrote:
 Felipe Contreras felipe.contre...@gmail.com writes:
 
 How do we know that this doesn't break any users of fast-import?  Your
 comment isn't very reassuring:

 "the vast majority of them will never be used again"

 So what's with the minority?
 
 Actually I don't think there's any minority. If the client program
 doesn't store blobs, the blob marks are not used anyway. So there's no
 change.

 I haven't been following this conversation in detail, but your proposed
 change sounds like something that would break cvs2git [1].  Let me
 explain what cvs2git does and why:

 CVS stores all of the revisions of a single file in a single filename,v
 file in rcsfile(5) format.  The revisions are stored as deltas ordered
 so that a single revision can be reconstructed from a single serial read
 of the file.

 cvs2git reads each of these files once, reconstructing *all* of the
 revisions for a file in a single go.  It then pours them into a
 git-fast-import stream as blobs and sets a mark on each blob.

 Only much later in the conversion does it have enough information to
 reconstruct tree-wide commits.  At that time it outputs git-fast-import
 data (to a second file) defining the git commits and their ancestry.
 The contents are defined by referring to the marks of blobs from the
 first git-fast-import stream file.

 This strategy speeds up the conversion *enormously*.

 So if I understand correctly that you are proposing to stop allowing
 marks on blob objects to be set and/or referred to later, then I object
 vociferously.

The proposed patch wants to stop writing marks (in --export-marks) for
anything but commits.  Does cvs2git depend on that?  I.e., are you using
two separate fast-import processes for the blob and tree/commit phases
you describe above?

-- 
Thomas Rast
trast@{inf,student}.ethz.ch


Re: [PATCH 4/4] fast-import: only store commit objects

2013-05-06 Thread Michael Haggerty
On 05/06/2013 12:32 PM, Thomas Rast wrote:
 Michael Haggerty mhag...@alum.mit.edu writes:
 
 On 05/03/2013 08:23 PM, Felipe Contreras wrote:
 On Fri, May 3, 2013 at 12:56 PM, Thomas Rast tr...@inf.ethz.ch wrote:
 Felipe Contreras felipe.contre...@gmail.com writes:

 How do we know that this doesn't break any users of fast-import?  Your
 comment isn't very reassuring:

 "the vast majority of them will never be used again"

 So what's with the minority?

 Actually I don't think there's any minority. If the client program
 doesn't store blobs, the blob marks are not used anyway. So there's no
 change.

 I haven't been following this conversation in detail, but your proposed
 change sounds like something that would break cvs2git [1].  Let me
 explain what cvs2git does and why:

 CVS stores all of the revisions of a single file in a single filename,v
 file in rcsfile(5) format.  The revisions are stored as deltas ordered
 so that a single revision can be reconstructed from a single serial read
 of the file.

 cvs2git reads each of these files once, reconstructing *all* of the
 revisions for a file in a single go.  It then pours them into a
 git-fast-import stream as blobs and sets a mark on each blob.

 Only much later in the conversion does it have enough information to
 reconstruct tree-wide commits.  At that time it outputs git-fast-import
 data (to a second file) defining the git commits and their ancestry.
 The contents are defined by referring to the marks of blobs from the
 first git-fast-import stream file.

 This strategy speeds up the conversion *enormously*.

 So if I understand correctly that you are proposing to stop allowing
 marks on blob objects to be set and/or referred to later, then I object
 vociferously.
 
 The proposed patch wants to stop writing marks (in --export-marks) for
 anything but commits.  Does cvs2git depend on that?  I.e., are you using
 two separate fast-import processes for the blob and tree/commit phases
 you describe above?

Yes, it can be handy to start loading the first blobfile in parallel
with the later stages of the conversion, before the second dumpfile is
ready.  In that case the user needs to pass --export-marks to the first
fast-import process to export marks on blobs so that the marks can be
passed to the second fast-import via --import-marks.

So the proposed change would break a documented use of cvs2git.

Making the export of blob marks optional would of course be OK, as long
as the default is to export them.

Michael


-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/


RE: trouble on windows network share

2013-05-06 Thread David Goldfarb
Oops, I earlier missed the second half of your message, where you suggested
that I patch move_temp_to_file().
Sorry, I don't have the tool-chain or the knowledge to recompile git for this
Windows/Cygwin environment.

What I can do, realistically, is:
- Follow exact instructions.
- (with some obvious hesitation) run executables that you send me.
- Continue whatever tests you want with my existing git.

Note, too, that:
- This is occurring on a standard, off-the-shelf NAS box, the Western Digital
My Book Live.
- I would guess that the craziness has more to do with the box's network
stack than with its FS, since it behaves fine when mounted on Linux, but not on
Windows.

David



-Original Message-
From: David Goldfarb 
Sent: Monday, May 06, 2013 1:13 PM
To: 'Thomas Rast'
Cc: git@vger.kernel.org
Subject: RE: trouble on windows network share

Ok.  Continuing in the same shell:

U:\foo>git status
# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#   new file:   trace1
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#   trace2

U:\foo>git commit -m "test b"
error: Trying to write ref refs/heads/master with nonexistent object 
428dafc292b8396fc7b3c7f692dc9dfe1196a40a
fatal: cannot update HEAD ref

U:\foo>dir /s *a40a
 Volume in drive U is Users
 Volume Serial Number is FC97-3DA5

 Directory of U:\foo\.git\objects\42

05/06/2013  12:26 PM               152 8dafc292b8396fc7b3c7f692dc9dfe1196a40a
               1 File(s)            152 bytes

 Total Files Listed:
               1 File(s)            152 bytes
               0 Dir(s)  1,594,477,576,192 bytes free

U:\foo>date
The current date is: Mon 05/06/2013
Enter the new date: (mm-dd-yy)
U:\foo>time
The current time is: 13:10:49.55
Enter the new time:
U:\foo>git cat-file -p 428dafc292b8396fc7b3c7f692dc9dfe1196a40a
tree cecae5b4c87ea21aef513fcfcd5c27fe87e0536f
parent f6a0de35d12a3b58f12bf1af4ff629b8b004ad82
author David Goldfarb d...@degel.com 1367834997 +0300
committer David Goldfarb d...@degel.com 1367834997 +0300

test b

U:\foo>



David

-Original Message-
From: Thomas Rast [mailto:tr...@inf.ethz.ch]
Sent: Monday, May 06, 2013 1:08 PM
To: David Goldfarb
Cc: git@vger.kernel.org
Subject: Re: trouble on windows network share

David Goldfarb d...@degel.com writes:

 Looks like it works.

 From the windows machine:
 U:\foo>git cat-file -p 0b89efdeef245ed6a0a7eacc5c578629a141f856
 100644 blob b02e7c87fe376a353ea4f014bdb3f5200a946b37	foo1
 100644 blob 2cbf64f759a62392ad9dfe1fb9c2cdb175876014	foo2

 U:\foo>


 Double-checking that nothing was fixed or changed when I earlier committed 
 the file from Linux, here's a second test:


 U:\foo>git cat-file -p 0b89efdeef245ed6a0a7eacc5c578629a141f856
 100644 blob b02e7c87fe376a353ea4f014bdb3f5200a946b37	foo1
 100644 blob 2cbf64f759a62392ad9dfe1fb9c2cdb175876014	foo2
 
 U:\foo>git status
 # On branch master
 # Untracked files:
 #   (use "git add <file>..." to include in what will be committed)
 #
 #   trace1
 #   trace2
 nothing added to commit but untracked files present (use "git add" to track)
 
 U:\foo>git add trace1
 
 U:\foo>git commit trace1 -m testing

Note that specifying 'trace1' here is redundant, since you already added it and 
had no other staged changes.  Perhaps you can re-run a test like this without 
the extra argument for comparison.  That would tell us if it matters that the 
write and read happen in the same process.

 error: unable to find cecae5b4c87ea21aef513fcfcd5c27fe87e0536f
 fatal: cecae5b4c87ea21aef513fcfcd5c27fe87e0536f is not a valid object
 
 U:\foo>git cat-file -p cecae5b4c87ea21aef513fcfcd5c27fe87e0536f
 100644 blob b02e7c87fe376a353ea4f014bdb3f5200a946b37	foo1
 100644 blob 2cbf64f759a62392ad9dfe1fb9c2cdb175876014	foo2
 100644 blob 19102815663d23f8b75a47e7a01965dcdc96468c	test.txt
 100644 blob c9009b02950964cf1d5281125e6e2f647dd9dc16	trace1

I'm inclined to just say that your FS is crazy.

What's unsatisfactory is that we already have a bunch of crazy FS workarounds 
in move_temp_to_file(), which is one obvious candidate for what is going on 
here.  So this wouldn't be something new; just another craziness to work around.

For example, you could test the theory that rename() has something to do with 
it by patching this into move_temp_to_file():

diff --git i/sha1_file.c w/sha1_file.c
index 67e815b..22af015 100644
--- i/sha1_file.c
+++ w/sha1_file.c
@@ -2635,6 +2635,10 @@ int move_temp_to_file(const char *tmpfile, const char *filename)
 		/* FIXME!!! Collision check here ? */
 	}
 
+	if (access(filename, R_OK) < 0)
+		return error("access(%s, R_OK) failed immediately after 

Re: another packed-refs race

2013-05-06 Thread Michael Haggerty
On 05/03/2013 10:38 AM, Jeff King wrote:
 I found another race related to the packed-refs code. Consider for a
 moment what happens when we are looking at refs and another process does
 a simultaneous git pack-refs --all --prune, updating packed-refs and
 deleting the loose refs.
 
 If we are resolving a single ref, then we will either find its loose
 form or not. If we do, we're done. If not, we can fall back on what was
 in the packed-refs file. If we read the packed-refs file at that point,
 we're OK. If the loose ref existed before but was pruned before we could
 read it, then we know the packed-refs file is already in place, because
 pack-refs would not have deleted the loose ref until it had finished
 writing the new file. But imagine that we already loaded the packed-refs
 file into memory earlier. We may be looking at an _old_ version of it
 that has an irrelevant sha1 from the older packed-refs file. Or it might
 not even exist in the packed-refs file at all, and we think the ref does
 not resolve.

 We could fix this by making sure our packed-refs file is up to date

s/file/cache/

 before using it. E.g., resolving a ref with the following sequence:
 
   1. Look for loose ref. If found, OK.
 
   2. Compare inode/size/mtime/etc of on-disk packed-refs to their values
  from the last time we loaded it. If they're not the same, reload
  packed-refs from disk. Otherwise, continue.
 
   3. Look for ref in in-memory packed-refs.
 
 Not too bad. We introduce one extra stat() for a ref that has been
 packed, and the scheme isn't very complicated.

Let me think out loud alongside your analysis...

By this mechanism the reader can ensure that it never uses a version of
the packed-refs file that is older than its information that the
corresponding loose ref is absent from the filesystem.

This is all assuming that the filesystem accesses have a defined order;
how is that guaranteed?  pack_refs() and commit_ref() both rely on
commit_lock_file(), which calls

close(fd) on the lockfile
rename(lk->filename, result_file)

prune_ref() locks the ref, verifies that its SHA-1 is unchanged, then
calls unlink(), then rollback_lock_file().

The atomicity of rename() guarantees that a reader sees either the old
or the new version of the file in question.  But what guarantees are
there about accesses across two files?  Suppose we start with ref foo
that exists only as a loose ref, and we have a pack-refs process doing

write packed-refs with foo
commit_lock_file() for packed-refs
read loose ref foo and verify that its SHA-1 is unchanged
unlink() loose ref foo

while another process is trying to read the reference:

look for loose ref foo
read packed-refs

Is there any guarantee that the second process can't see the loose ref
foo as being missing but nevertheless read the old version of
packed-refs?  I'm not strong enough on filesystem semantics to answer
that question.

 But what about enumerating refs via for_each_ref? It's possible to have
 the same problem there, and the packed-refs file may be moved into place
 midway through the process of enumerating the loose refs. So we may see
 refs/heads/master, but when we get to refs/remotes/origin/master, it has
 now been packed and pruned.

Yes.

 I _think_ we can get by with:
 
   1. Generate the complete sorted list of loose refs.
 
   2. Check that packed-refs is stat-clean, and reload if necessary, as
  above.
 
   3. Merge the sorted loose and packed lists, letting loose override
  packed (so even if we get repacked halfway through our loose
  traversal and get half of the refs there, it's OK to see duplicates
  in packed-refs, which we will ignore).
 
 This is not very far off of what we do now. Obviously we don't do the
 stat-clean check in step 2. But we also don't generate the complete list
 of loose refs before hitting the packed-refs file. Instead we lazily
 load the loose directories as needed. And of course we cache that
 information in memory, even though it may go out of date. I think the
 best we could do is keep a stat() for each individual directory we see,
 and check it before using the in-memory contents. That may be a lot of
 stats, but it's still better than actually opening each loose ref
 separately.

The loose refs cache is only used by the for_each_ref() functions, not
for looking up individual references.  Another approach would be to
change the top-level for_each_ref() functions to re-stat() all of the
loose references within the namespace that interests it, *then* verify
that the packed-ref cache is not stale, *then* start the iteration.
Then there would be no need to re-stat() files during the iteration.
(This would mean that we have to prohibit a second reference iteration
from being started while one is already in progress.)

Of course, clearing (part of) the loose reference cache invalidates any
pointers that other callers might have retained to refnames in the old
version of the cache.  

Re: another packed-refs race

2013-05-06 Thread Michael Haggerty
On 05/03/2013 07:28 PM, Jeff King wrote:
 On Fri, May 03, 2013 at 11:26:11AM +0200, Johan Herland wrote:
 
 You don't really need to be sure that packed-refs is up-to-date. You
 only need to make sure that don't rely on lazily loading loose refs
 _after_ you have loaded packed-refs.
 
 True. As long as you load them both together, and always make sure you
 do loose first, you'd be fine. But I think there will be corner cases
 where you have loaded _part_ of the loose ref namespace. I think part of
 the point of Michael's ref work is that if you call for_each_tag_ref,
 we would not waste time loading refs/remotes/ at all. If you then follow
 that with a call to for_each_ref, you would want to re-use the cached
 work from traversing refs/tags/, and then traverse refs/remotes/. You
 know that your cached packed-refs is good with respect to refs/tags/,
 but you don't with respect to refs/remotes.

Correct.

[I wonder if there really are a lot of iterations over overlapping parts
of the reference namespace within a single git process...]

 The following solution might work in both the resolve-a-single-ref and
 enumerating-refs case:

 0. Look for ref already cached in memory. If found, OK.

 1. Look for loose ref. If found, OK.

 2. If not found, load all loose refs and packed-refs from disk (in
 that order), and store in memory for remainder of this process. Never
 reload packed-refs from disk (unless you also reload all loose refs
 first).
 
 I think that would be correct (modulo that step 1 cannot happen for
 enumeration). But we would like to avoid loading all loose refs if we
 can. Especially on a cold cache, it can be quite slow, and you may not
 even care about those refs for the current operation (I do not recall
 the exact original motivation for the lazy loading, but it was something
 along those lines).

Lazy loading was first inspired by the observation that effectively
every git invocation was loading *all* loose references to do an
iteration over refs/replace/ (which I've never even used!)  This was
absolutely killing the performance of filter-branch, which creates a lot
of loose references and invokes git many times--even though the cache
was warm.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/


Re: [PATCH 4/4] fast-import: only store commit objects

2013-05-06 Thread Johannes Schindelin
Hi Thomas,

On Mon, 6 May 2013, Thomas Rast wrote:

 The proposed patch wants to stop writing marks (in --export-marks) for
 anything but commits.

Then it should not go in.

Ciao,
Dscho


Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

2013-05-06 Thread Jeff King
On Sun, May 05, 2013 at 05:38:53PM -0500, Felipe Contreras wrote:

 We don't care about blobs, or any object other than commits, but in
 order to find the type of object, we are parsing the whole thing, which
 is slow, specially in big repositories with lots of big files.

I did a double-take on reading this subject line and first paragraph,
thinking surely fast-export needs to actually output blobs?.

Reading the patch, I see that this is only about not bothering to load
blob marks from --import-marks. It might be nice to mention that in the
commit message, which is otherwise quite confusing.

I'm also not sure why your claim we don't care about blobs is true,
because naively we would want future runs of fast-export to avoid having
to write out the whole blob content when mentioning the blob again. I
think one argument could be if we write a mark for blob X, we will also
have written a mark for commit Y which contains it; on subsequent runs,
we will just show the mark for Y in the first place, and not even care
about showing X (as a part of Y) either way. We would only refer to the
mark for X if it appears as part of a different commit, but that is a
rare case not worth worrying about.

Does that match your reasoning?

 Before this, loading the objects of a fresh emacs import, with 260598
 blobs took 14 minutes, after this patch, it takes 3 seconds.

Presumably most of that speed improvement comes from not parsing the
blob objects. I wonder if you could get similar speedups by applying the
do not bother parsing rule from your patch 3. You would still incur
some cost to create a struct blob, but it may or may not be
measurable.  That would mean we get the case not worth worrying about
from above for free. I doubt it would make that big a difference,
though, given the rarity of it. So I am OK with it either way.

-Peff


Segfault in git merge-tree (1.8.2.2)

2013-05-06 Thread Andreas Jacobsen
Hi,

I'm getting a segfault in git merge-tree using v1.8.2.2 on MacOS
10.8.3. I can't share the repo, but I can build patches and check if
they fix the problem :)

Here's a bt:

(gdb) run
Starting program:
/usr/local/Cellar/git/1.8.2.2/libexec/git-core/git-merge-tree
027058e6ac8d03e029c4e1455bf90f63cd20e65b FETCH_HEAD master
Reading symbols for shared libraries +.. done

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x
0x00010003e532 in threeway_callback ()
(gdb) bt
#0  0x00010003e532 in threeway_callback ()
#1  0x0001000da03b in traverse_trees ()
#2  0x00010003e46b in threeway_callback ()
#3  0x0001000da03b in traverse_trees ()
#4  0x00010003e46b in threeway_callback ()
#5  0x0001000da03b in traverse_trees ()
#6  0x00010003e46b in threeway_callback ()
#7  0x0001000da03b in traverse_trees ()
#8  0x00010003e46b in threeway_callback ()
#9  0x0001000da03b in traverse_trees ()
#10 0x00010003e46b in threeway_callback ()
#11 0x0001000da03b in traverse_trees ()
#12 0x00010003e46b in threeway_callback ()
#13 0x0001000da03b in traverse_trees ()
#14 0x00010003e46b in threeway_callback ()
#15 0x0001000da03b in traverse_trees ()
#16 0x00010003e46b in threeway_callback ()
#17 0x0001000da03b in traverse_trees ()
#18 0x00010003e46b in threeway_callback ()
#19 0x0001000da03b in traverse_trees ()
#20 0x00010003e46b in threeway_callback ()
#21 0x0001000da03b in traverse_trees ()
#22 0x00010003e46b in threeway_callback ()
#23 0x0001000da03b in traverse_trees ()
#24 0x00010003df02 in cmd_merge_tree ()
#25 0x00010e99 in handle_internal_command ()
#26 0x00010c59 in main ()

-Andreas


RE: trouble on windows network share

2013-05-06 Thread Pyeron, Jason J CTR (US)
 -Original Message-
 From: Thomas Rast
 Sent: Monday, May 06, 2013 5:42 AM
 
 David Goldfarb d...@degel.com writes:
 
  Git works correctly under Linux (Ubuntu 12.04; git 1.7.9.5).  I've
 attached the strace outputs. (Note: for reasons that are probably
 irrelevant, I needed to run the commands sudo'd. Shout back if this is
 an issue).
 
  Under Windows 7, Cygwin git 1.7.9, commit fails:
  U:\foo>git commit -m "added foo2"
  error: unable to find 0b89efdeef245ed6a0a7eacc5c578629a141f856
  fatal: 0b89efdeef245ed6a0a7eacc5c578629a141f856 is not a valid object
 
  For what it's worth, note that the file does exist.
  U:\foo>ls -l .git/objects/0b
total 1024
-rwxrw-r-- 1   74 May  5 01:15 89efdeef245ed6a0a7eacc5c578629a141f856
 
  (I'm not sure why the permissions are trashed. Seems to be a Cygwin
 thing, or maybe my Cygwin config. The ?? also  appears on local
 files, and I believe also with files on the old Buffalo drive, so I
 don't think it is relevant to the problem).  Just in case, here's the
 same dir, as seen from the Ubuntu VM:
 
deg@ubuntu:/mnt/users/foo$ ls -l .git/objects/0b
total 64
-rwxr-xr-x 0 root root 74 May  5 01:15 89efdeef245ed6a0a7eacc5c578629a141f856
 
  Again, note that there is some user permissions lossage here. I don't
 know enough about Linux mount or CIFS, and apparently did the mount in
 a way that everything seems to appear to be stuck owned by root. (same
 problem I hinted at above). Hope this is not relevant to the problem.
 
  Here's how the same directory looks, when I'm ssh'd into the NAS box
 itself:
 
 CentralPark:/shares/Users/foo# ls -l .git/objects/0b
total 64
-rwxrw-r-- 1 deg share 74 May  5 01:15 89efdeef245ed6a0a7eacc5c578629a141f856
 
  In any event, the symptoms don't seem to be a permissions problem, so
 all this extra info is probably just a red herring, I hope.
 
 Hrm.  What about what Jeff already asked of the OP (and AFAICS never
 got
 a reply)?

If referring to me, then yes but it was too big for the list.

 
 } If it's a race condition between the write and the subsequent read in
 } the same process, then it would be solved by looking at the object
 } later. Does git cat-file -p
 6838761d549cf76033d2e9faf5954e62839eb25d
 } work, or is the object forever inaccessible?
 
 In your case: git cat-file -p 0b89efdeef245ed6a0a7eacc5c578629a141f856






Re: Segfault in git merge-tree (1.8.2.2)

2013-05-06 Thread John Keeping
On Mon, May 06, 2013 at 03:02:10PM +0200, Andreas Jacobsen wrote:
 I'm getting a segfault in git merge-tree using v1.8.2.2 on MacOS
 10.8.3. I can't share the repo, but I can build patches and check if
 they fix the problem :)

Can you rebuild with debugging information and try the backtrace again?

Something like:

make CFLAGS='-O0 -g'

Then use the git in bin-wrappers/.


Re: [PATCH v2] remove the impression of unexpectedness when access is denied

2013-05-06 Thread Junio C Hamano
Jonathan Nieder jrnie...@gmail.com writes:

 I ran into this message for the first time today.

  $ git fetch --all
  Fetching origin
  remote: Counting objects: 368, done.
 [...]
  Fetching gitk
  fatal: Could not read from remote repository.

  Please make sure you have the correct access rights
  and the repository exists.
  error: Could not fetch gitk
  Fetching debian
  Fetching pape
 [...]

 The gitk remote refers to git://git.kernel.org/pub/scm/gitk/gitk.
 Using ls-remote to contact it produces the same result.  The message
 is correct: the repository does not exist.

 Impressions:

  * Looking at Could not read, it is not clear what could not read
and why.  GIT_TRACE_PACKET tells me the interaction was

   me git-upload-pack /pub/scm/gitk/gitk\0host=git.kernel.org\0
   them (hangup)

Would it make sense for the server to send an ERR packet to give
a more helpful diagnosis?

I think git-daemon does so (or at least attempts to do so);
path_ok() uses enter_repo() to check if the given path is a
repository, returns NULL to run_service(), which in turn calls
daemon_error() that does the ERR thing.

  * The spacing and capitalization is odd and makes it not flow well
with the rest of the output.  I suspect it would be easier to read
with the error separated from hints:

   Fetching gitk
   fatal: the remote server sent an empty response
   hint: does the repository exist?
   hint: do you have the correct access rights?
   error: Could not fetch gitk
   Fetching debian

If a server is misconfigured and just decides to send an empty
response for no good reason, the output would still be true.

It does sound better. Also s/Could not fetch/could not fetch/.

  * The error message is the same whether the server returned no
response or an incomplete pkt-line.  Maybe in the latter case it
should print the hung up unexpectedly thing.

OK.


[PATCH] contrib/hooks/post-receive-email: get description from repo.git/config

2013-05-06 Thread Trond Hasle Amundsen

Hello,

The included patch attempts to improve post-receive-email. It's a
trivial change, but one that helps us Gitolite users. Comments are
welcome, and this is my first attempt at contributing to the Git
project. Please keep me on CC as I'm not on the list.

From 878a7af9088e2bcc3afc9b09b9023f1f188c844b Mon Sep 17 00:00:00 2001
From: Trond Hasle Amundsen t.h.amund...@usit.uio.no
Date: Mon, 6 May 2013 15:41:25 +0200
Subject: [PATCH] contrib/hooks/post-receive-email: get description from repo.git/config

When getting the project description, we first try the gitweb.description
entry in repo.git/config, but repo.git/description takes precedence if
it exists. This behaviour mimics that of Gitweb, and is what we want
when using Gitolite, which deletes repo.git/description upon repo
creation and rather maintains a gitweb.description entry in
repo.git/config if a description is configured.

Signed-off-by: Trond Hasle Amundsen t.h.amund...@usit.uio.no
---
 contrib/hooks/post-receive-email |9 -
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/contrib/hooks/post-receive-email b/contrib/hooks/post-receive-email
index 0e5b72d..6ce046a 100755
--- a/contrib/hooks/post-receive-email
+++ b/contrib/hooks/post-receive-email
@@ -714,7 +714,14 @@ if [ -z "$GIT_DIR" ]; then
 	exit 1
 fi
 
-projectdesc=$(sed -ne '1p' "$GIT_DIR/description" 2>/dev/null)
+# Get the repo's description. First try gitweb.description entry in
+# repo.git/config, but repo.git/description takes precedence if it
+# exists. This behaviour mimics that of Gitweb.
+projectdesc=$(git config gitweb.description)
+if [ -f "$GIT_DIR/description" ]; then
+	projectdesc=$(sed -ne '1p' "$GIT_DIR/description" 2>/dev/null)
+fi
+
 # Check if the description is unchanged from it's default, and shorten it to
 # a more manageable length if it is
 if expr "$projectdesc" : "Unnamed repository.*$" >/dev/null
-- 
1.7.1


Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: Segfault in git merge-tree (1.8.2.2)

2013-05-06 Thread Andreas Jacobsen
Sure, here you go, this time built from the HEAD I found on github
(7d3ccdffb5d28970dd7a4d177cfcca690ccd0c22) with:

NO_GETTEXT=1 make prefix=/usr/local/Cellar/git/HEAD CC=cc CFLAGS='-O0
-g' install (this is homebrew's setup, but I built it manually rather
than using the recipe.)

And the gdb output:

(gdb) set args merge-tree 027058e6ac8d03e029c4e1455bf90f63cd20e65b
FETCH_HEAD master
(gdb) run
Starting program: /usr/local/bin/git merge-tree
027058e6ac8d03e029c4e1455bf90f63cd20e65b FETCH_HEAD master
Reading symbols for shared libraries +.. done

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x
0x00010006f1a3 in add_merge_entry (entry=0x100432ba0) at
builtin/merge-tree.c:24
24 *merge_result_end = entry;
(gdb) bt
#0  0x00010006f1a3 in add_merge_entry (entry=0x100432ba0) at
builtin/merge-tree.c:24
#1  0x00010006ee24 in unresolved (info=0x7fff5fbfd508,
n=0x1004367c0) at builtin/merge-tree.c:270
#2  0x00010006eaa2 in threeway_callback (n=3, mask=4, dirmask=0,
entry=0x1004367c0, info=0x7fff5fbfd508) at builtin/merge-tree.c:330
#3  0x00010018eef2 in traverse_trees (n=3, t=0x7fff5fbfd5d0,
info=0x7fff5fbfd508) at tree-walk.c:407
#4  0x00010006e8e8 in merge_trees_recursive (t=0x7fff5fbfd5d0,
base=0x100436780
bibliotek/httpservice/src/main/java/no/ok/it/lib/http/service,
df_conflict=1) at builtin/merge-tree.c:341
#5  0x00010006f082 in unresolved_directory (info=0x7fff5fbfd7d8,
n=0x100436660, df_conflict=1) at builtin/merge-tree.c:221
#6  0x00010006ed2c in unresolved (info=0x7fff5fbfd7d8,
n=0x100436660) at builtin/merge-tree.c:258
#7  0x00010006eaa2 in threeway_callback (n=3, mask=4, dirmask=4,
entry=0x100436660, info=0x7fff5fbfd7d8) at builtin/merge-tree.c:330
#8  0x00010018eef2 in traverse_trees (n=3, t=0x7fff5fbfd8a0,
info=0x7fff5fbfd7d8) at tree-walk.c:407
#9  0x00010006e8e8 in merge_trees_recursive (t=0x7fff5fbfd8a0,
base=0x100435e70
bibliotek/httpservice/src/main/java/no/ok/it/lib/http,
df_conflict=1) at builtin/merge-tree.c:341
#10 0x00010006f082 in unresolved_directory (info=0x7fff5fbfdaa8,
n=0x100435d50, df_conflict=1) at builtin/merge-tree.c:221
#11 0x00010006ed2c in unresolved (info=0x7fff5fbfdaa8,
n=0x100435d50) at builtin/merge-tree.c:258
#12 0x00010006eaa2 in threeway_callback (n=3, mask=4, dirmask=4,
entry=0x100435d50, info=0x7fff5fbfdaa8) at builtin/merge-tree.c:330
#13 0x00010018eef2 in traverse_trees (n=3, t=0x7fff5fbfdb70,
info=0x7fff5fbfdaa8) at tree-walk.c:407
#14 0x00010006e8e8 in merge_trees_recursive (t=0x7fff5fbfdb70,
base=0x100435d10 bibliotek/httpservice/src/main/java/no/ok/it/lib,
df_conflict=1) at builtin/merge-tree.c:341
#15 0x00010006f082 in unresolved_directory (info=0x7fff5fbfdd78,
n=0x100436450, df_conflict=1) at builtin/merge-tree.c:221
#16 0x00010006ed2c in unresolved (info=0x7fff5fbfdd78,
n=0x100436450) at builtin/merge-tree.c:258
#17 0x00010006eaa2 in threeway_callback (n=3, mask=4, dirmask=4,
entry=0x100436450, info=0x7fff5fbfdd78) at builtin/merge-tree.c:330
#18 0x00010018eef2 in traverse_trees (n=3, t=0x7fff5fbfde40,
info=0x7fff5fbfdd78) at tree-walk.c:407
#19 0x00010006e8e8 in merge_trees_recursive (t=0x7fff5fbfde40,
base=0x100436400 bibliotek/httpservice/src/main/java/no/ok/it,
df_conflict=1) at builtin/merge-tree.c:341
#20 0x00010006f082 in unresolved_directory (info=0x7fff5fbfe048,
n=0x100433c80, df_conflict=1) at builtin/merge-tree.c:221
#21 0x00010006ed2c in unresolved (info=0x7fff5fbfe048,
n=0x100433c80) at builtin/merge-tree.c:258
#22 0x00010006eaa2 in threeway_callback (n=3, mask=4, dirmask=4,
entry=0x100433c80, info=0x7fff5fbfe048) at builtin/merge-tree.c:330
#23 0x00010018eef2 in traverse_trees (n=3, t=0x7fff5fbfe110,
info=0x7fff5fbfe048) at tree-walk.c:407
#24 0x00010006e8e8 in merge_trees_recursive (t=0x7fff5fbfe110,
base=0x100433c50 bibliotek/httpservice/src/main/java/no/ok,
df_conflict=1) at builtin/merge-tree.c:341
#25 0x00010006f082 in unresolved_directory (info=0x7fff5fbfe318,
n=0x10042d2a0, df_conflict=1) at builtin/merge-tree.c:221
#26 0x00010006ed2c in unresolved (info=0x7fff5fbfe318,
n=0x10042d2a0) at builtin/merge-tree.c:258
#27 0x00010006eaa2 in threeway_callback (n=3, mask=4, dirmask=4,
entry=0x10042d2a0, info=0x7fff5fbfe318) at builtin/merge-tree.c:330
#28 0x00010018eef2 in traverse_trees (n=3, t=0x7fff5fbfe3e0,
info=0x7fff5fbfe318) at tree-walk.c:407
#29 0x00010006e8e8 in merge_trees_recursive (t=0x7fff5fbfe3e0,
base=0x10042d250 bibliotek/httpservice/src/main/java/no,
df_conflict=1) at builtin/merge-tree.c:341
#30 0x00010006f082 in unresolved_directory (info=0x7fff5fbfe5e8,
n=0x100435750, df_conflict=1) at builtin/merge-tree.c:221
#31 0x00010006ed2c in unresolved (info=0x7fff5fbfe5e8,
n=0x100435750) at builtin/merge-tree.c:258
#32 0x00010006eaa2 in threeway_callback (n=3, mask=4, dirmask=4,
entry=0x100435750, 

Re: Segfault in git merge-tree (1.8.2.2)

2013-05-06 Thread John Keeping
On Mon, May 06, 2013 at 04:13:28PM +0200, Andreas Jacobsen wrote:
 Sure, here you go, this time built from the HEAD I found on github
 (7d3ccdffb5d28970dd7a4d177cfcca690ccd0c22) with:
 
 NO_GETTEXT=1 make prefix=/usr/local/Cellar/git/HEAD CC=cc CFLAGS='-O0
 -g' install (this is homebrew's setup, but I built it manually rather
 than using the recipe.)
 
 And the gdb output:
 
 (gdb) set args merge-tree 027058e6ac8d03e029c4e1455bf90f63cd20e65b
 FETCH_HEAD master
 (gdb) run
 Starting program: /usr/local/bin/git merge-tree
 027058e6ac8d03e029c4e1455bf90f63cd20e65b FETCH_HEAD master
 Reading symbols for shared libraries +.. done
 
 Program received signal EXC_BAD_ACCESS, Could not access memory.
 Reason: KERN_INVALID_ADDRESS at address: 0x
 0x00010006f1a3 in add_merge_entry (entry=0x100432ba0) at
 builtin/merge-tree.c:24
 24 *merge_result_end = entry;

Thanks.  I have an idea of what's going on, but the set up is in an
earlier pass and it only fails the next time it gets into
add_merge_entry.

Can you try adding the following patch on top?  Hopefully the added
debug is in the right caller, otherwise the new assert at the top will
point us to the right one.

-- 8< --
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index ec49917..8eebab7 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -23,6 +23,7 @@ static void add_merge_entry(struct merge_list *entry)
 {
*merge_result_end = entry;
	merge_result_end = &entry->next;
+   assert(merge_result_end);
 }
 
 static void merge_trees_recursive(struct tree_desc t[3], const char *base, int df_conflict);
@@ -267,6 +268,12 @@ static void unresolved(const struct traverse_info *info, struct name_entry n[3])
	if (n[0].mode && !S_ISDIR(n[0].mode))
entry = link_entry(1, info, n + 0, entry);
 
+   if (!entry) {
+		fprintf(stderr, "n[0].mode = %d\nn[1].mode = %d\nn[2].mode = %d\n",
+			n[0].mode, n[1].mode, n[2].mode);
+   assert(FALSE);
+   }
+
add_merge_entry(entry);
 }
 


Re: Segfault in git merge-tree (1.8.2.2)

2013-05-06 Thread John Keeping
On Mon, May 06, 2013 at 03:29:23PM +0100, John Keeping wrote:
 On Mon, May 06, 2013 at 04:13:28PM +0200, Andreas Jacobsen wrote:
  Sure, here you go, this time built from the HEAD I found on github
  (7d3ccdffb5d28970dd7a4d177cfcca690ccd0c22) with:
  
  NO_GETTEXT=1 make prefix=/usr/local/Cellar/git/HEAD CC=cc CFLAGS='-O0
  -g' install (this is homebrew's setup, but I built it manually rather
  than using the recipe.)
  
  And the gdb output:
  
  (gdb) set args merge-tree 027058e6ac8d03e029c4e1455bf90f63cd20e65b
  FETCH_HEAD master
  (gdb) run
  Starting program: /usr/local/bin/git merge-tree
  027058e6ac8d03e029c4e1455bf90f63cd20e65b FETCH_HEAD master
  Reading symbols for shared libraries +.. done
  
  Program received signal EXC_BAD_ACCESS, Could not access memory.
  Reason: KERN_INVALID_ADDRESS at address: 0x
  0x00010006f1a3 in add_merge_entry (entry=0x100432ba0) at
  builtin/merge-tree.c:24
  24 *merge_result_end = entry;
 
 Thanks.  I have an idea of what's going on, but the set up is in an
 earlier pass and it only fails the next time it gets into
 add_merge_entry.
 
 Can you try adding the following patch on top?  Hopefully the added
 debug is in the right caller, otherwise the new assert at the top will
 point us to the right one.

Actually, don't worry about it.  I've tracked down the problem and have
a failing test case for t4300.  Patch to follow shortly.


Re: Add porcelain command to revert a single staged file to its HEAD state while preserving its staged state

2013-05-06 Thread Junio C Hamano
Dimitar Bonev dsbo...@gmail.com writes:

[administrivia: please do not drop people out of Cc list]

 Actually this is not the case as I tried to explain with the 'git
 commit' followed by 'git checkout HEAD~1 -- targetfile' followed by
 'git commit --amend' example. The index and the working dir states are
 very well related.

That invites another question: if it is very well related, why isn't
it an option to start from the state you have in the working tree
(i.e. doing nothing), or in the index (i.e. "git checkout
controller")?

But the answer does not matter that much in the end.

 Lets imagine that I am adding an MVC feature X - it
 could be implemented with 3 or more files. If I stage the files and
 came up with an idea that requires a major rewrite of one of these
 files - lets say the controller one - then it is more convenient to
 checkout the file's HEAD state and build on top of it - I had been
 doing just that right before staging

So you have satisfactory changes for view and model in the index,
and changes to the controller that maybe is OK but you think it
could be improved, and you want to go back to the clean slate to
rethink the approach only for the controller part?

I think the story is essentially the same.  Let's say you did

git diff HEAD controller | git apply -R
edit controller

to get rid of the changes in the working tree, further worked on the
controller part, and came up with a different implementation.  Then
you would presumably do

git add controller

but at that point you would lose the "maybe OK" version you had in
the index.  What if you think the version in the working tree might
end up being better than "maybe OK" but can still be improved
further?  You never "git add" until the working tree version gets
into truly a better shape?

What happens fairly often is that you end up having more than _one_
version that is better than the version from the HEAD but you
cannot immediately say one is clearly better than the others on all
counts, and at some point you need more than one intermediate
state (while the index can hold only one) to keep these competing
versions. The next thing you would want to do is to take a deep
breath and pick the better parts from these good versions to come up
with the single final version. Keeping one good version in the index
does not let you do so.
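Incidentally, the staged "maybe OK" blob can be snapshotted before "git add"
overwrites it. A minimal sketch (the demo repository and the file name
"controller" are illustrative, following the example above):

```shell
# Sketch: save the staged ("maybe OK") version of a file before
# overwriting it with "git add".  Self-contained demo repository.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m init

printf 'maybe OK\n' >controller
git add controller                   # stage the version we might still want
staged=$(git rev-parse :controller)  # blob id of the staged version
git cat-file blob "$staged" >controller.maybe-ok

printf 'rewritten\n' >controller     # further work in the working tree
git add controller                   # would otherwise lose the "maybe OK" blob
```

The ":controller" syntax names the blob recorded in the index for that path,
so the old content survives in controller.maybe-ok even after the second
"git add".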

 ... commits - I treat them as finished work.

I think people should learn to take the biggest advantage of using a
distributed source control system, which is that commits do not have
to be finished work, until you choose to declare they are done and
push them out for others to see.

Fear of commitment is a disease we do not have to suffer once we
have graduated from centralized systems ;-).


Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

2013-05-06 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Sun, May 05, 2013 at 05:38:53PM -0500, Felipe Contreras wrote:

 We don't care about blobs, or any object other than commits, but in
 order to find the type of object, we are parsing the whole thing, which
 is slow, specially in big repositories with lots of big files.

 I did a double-take on reading this subject line and first paragraph,
 thinking "surely fast-export needs to actually output blobs?".

 Reading the patch, I see that this is only about not bothering to load
 blob marks from --import-marks. It might be nice to mention that in the
 commit message, which is otherwise quite confusing.

I had the same reaction first, but not writing the blob _objects_
out to the output stream would not make any sense, so it was fairly
easy to guess what the author wanted to say ;-).

 I'm also not sure why your claim "we don't care about blobs" is true,
 because naively we would want future runs of fast-export to avoid having
 to write out the whole blob content when mentioning the blob again.

The existing documentation is fairly clear that marks for objects
other than commits are not exported, and the import-marks codepath
discards anything but commits, so there is no mechanism for the
existing fast-export users to leave blob marks in the marks file for
later runs of fast-export to take advantage of.  The second
invocation cannot refer to such a blob in the first place.

The story is different on the fast-import side, where we do say we
dump the full table and a later run can depend on these marks.

By discarding marks on blobs, we may be robbing some optimization
possibilities, and by discarding marks on tags, we may be robbing
some features, from users of fast-export; we might want to add an
option --use-object-marks={blob,commit,tag} or something to both
fast-export and fast-import, so that the former can optionally write
marks for non-commits out, and the latter can omit non commit marks
if the user do not need them. But that is a separate issue.
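The documented behaviour is easy to check directly; a self-contained sketch
(the demo repository is illustrative) showing that the marks file written by
fast-export contains commit marks only, even though blobs receive marks in
the stream itself:

```shell
# Sketch: fast-export's --export-marks file lists only commit objects.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
echo content >file
git add file
git -c user.name=t -c user.email=t@example.com commit -qm one
git fast-export --export-marks=marks HEAD >/dev/null

# every exported mark should resolve to a commit object
cut -d' ' -f2 marks |
while read sha
do
	test "$(git cat-file -t "$sha")" = commit
done
```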


Re: [PATCH 4/4] fast-import: only store commit objects

2013-05-06 Thread Junio C Hamano
Michael Haggerty mhag...@alum.mit.edu writes:

 Yes, it can be handy to start loading the first blobfile in parallel
 with the later stages of the conversion, before the second dumpfile is
 ready.  In that case the user needs to pass --export-marks to the first
 fast-import process to export marks on blobs so that the marks can be
 passed to the second fast-import via --import-marks.

 So the proposed change would break a documented use of cvs2git.

 Making the export of blob marks optional would of course be OK, as long
 as the default is to export them.

Thanks for a concise summary.  Your use case fits exactly what
Felipe conjectured as the "nonexistent minority".

An option that lets the caller say "I only care about marks on these
types of objects" to be written to (and read from) the exported marks
file would help Felipe's use case without harming your use case,
and would be a sane and safe way to go.




[PATCH] merge-tree: handle directory/empty conflict correctly

2013-05-06 Thread John Keeping
git-merge-tree causes a null pointer dereference when a directory
entry exists in only one or two of the three trees being compared with
no corresponding entry in the other tree(s).

When this happens, we want to handle the entry as a directory and not
attempt to mark it as a file merge.  Do this by setting the entry's bit
in the directory mask when the entry is missing or when it is a
directory, only performing the file comparison when we know that a file
entry exists.

Reported-by: Andreas Jacobsen andr...@andreasjacobsen.com
Signed-off-by: John Keeping j...@keeping.me.uk
---
Andreas, can you try this patch and see if it fixes your test case?

 builtin/merge-tree.c  |  6 +-
 t/t4300-merge-tree.sh | 51 +++
 2 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index ec49917..61cbde4 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -251,7 +251,11 @@ static void unresolved(const struct traverse_info *info, struct name_entry n[3])
 
 	for (i = 0; i < 3; i++) {
 		mask |= (1 << i);
-		if (n[i].mode && S_ISDIR(n[i].mode))
+		/*
+		 * Treat missing entries as directories so that we return
+		 * after unresolved_directory has handled this.
+		 */
+		if (!n[i].mode || S_ISDIR(n[i].mode))
 			dirmask |= (1 << i);
 	}
 
 
diff --git a/t/t4300-merge-tree.sh b/t/t4300-merge-tree.sh
index 2defb42..9015e47 100755
--- a/t/t4300-merge-tree.sh
+++ b/t/t4300-merge-tree.sh
@@ -259,6 +259,57 @@ EXPECTED
test_cmp expected actual
 '
 
+test_expect_success 'tree add A, B (same)' '
+	cat >expect <<-\EOF &&
+	EOF
+	git reset --hard initial &&
+	mkdir sub &&
+	test_commit "add sub/file" sub/file file add-tree-A &&
+	git merge-tree initial add-tree-A add-tree-A >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'tree add A, B (different)' '
+	cat >expect <<-\EOF &&
+	added in both
+	  our    100644 43d5a8ed6ef6c00ff775008633f95787d088285d sub/file
+	  their  100644 ba629238ca89489f2b350e196ca445e09d8bb834 sub/file
+	@@ -1 +1,5 @@
+	+<<<<<<< .our
+	 AAA
+	+=======
+	+BBB
+	+>>>>>>> .their
+	EOF
+	git reset --hard initial &&
+	mkdir sub &&
+	test_commit "add sub/file" sub/file AAA add-tree-a-b-A &&
+	git reset --hard initial &&
+	mkdir sub &&
+	test_commit "add sub/file" sub/file BBB add-tree-a-b-B &&
+	git merge-tree initial add-tree-a-b-A add-tree-a-b-B >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'tree unchanged A, removed B' '
+	cat >expect <<-\EOF &&
+	removed in remote
+	  base   100644 43d5a8ed6ef6c00ff775008633f95787d088285d sub/file
+	  our    100644 43d5a8ed6ef6c00ff775008633f95787d088285d sub/file
+	@@ -1 +0,0 @@
+	-AAA
+	EOF
+	git reset --hard initial &&
+	mkdir sub &&
+	test_commit "add sub/file" sub/file AAA tree-remove-b-initial &&
+	git rm sub/file &&
+	test_tick &&
+	git commit -m "remove sub/file" &&
+	git tag tree-remove-b-B &&
+	git merge-tree tree-remove-b-initial tree-remove-b-initial tree-remove-b-B >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'turn file to tree' '
git reset --hard initial 
rm initial-file 
-- 
1.8.3.rc0.149.g98a72f2.dirty



Re: [PATCH] contrib/hooks/post-receive-email: get description from repo.git/config

2013-05-06 Thread Trond Hasle Amundsen
Trond Hasle Amundsen t.h.amund...@usit.uio.no writes:

 The included patch attempts to improve post-receive-email. It's a
 trivial change, but one that helps us Gitolite users. Comments are
 welcome, and this is my first attempt at contributing to the Git
 project. Please keep me on CC as I'm not on the list.

Bah.. It seems I attached this as MIME, sorry about that. Here it is,
properly inlined this time:

From 878a7af9088e2bcc3afc9b09b9023f1f188c844b Mon Sep 17 00:00:00 2001
From: Trond Hasle Amundsen t.h.amund...@usit.uio.no
Date: Mon, 6 May 2013 15:41:25 +0200
Subject: [PATCH] contrib/hooks/post-receive-email: get description from 
repo.git/config

When getting the project description, we first try gitweb.description
entry in repo.git/config, but repo.git/description takes precedence if
it exists. This behaviour mimics that of Gitweb, and is what we want
when using Gitolite, which deletes repo.git/description upon repo
creation and rather maintains a gitweb.description entry in
repo.git/config if a description is configured.

Signed-off-by: Trond Hasle Amundsen t.h.amund...@usit.uio.no
---
 contrib/hooks/post-receive-email |9 -
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/contrib/hooks/post-receive-email b/contrib/hooks/post-receive-email
index 0e5b72d..6ce046a 100755
--- a/contrib/hooks/post-receive-email
+++ b/contrib/hooks/post-receive-email
@@ -714,7 +714,14 @@ if [ -z "$GIT_DIR" ]; then
 	exit 1
 fi
 
-projectdesc=$(sed -ne '1p' "$GIT_DIR/description" 2>/dev/null)
+# Get the repo's description. First try gitweb.description entry in
+# repo.git/config, but repo.git/description takes precedence if it
+# exists. This behaviour mimics that of Gitweb.
+projectdesc=$(git config gitweb.description)
+if [ -f "$GIT_DIR/description" ]; then
+	projectdesc=$(sed -ne '1p' "$GIT_DIR/description" 2>/dev/null)
+fi
+
 # Check if the description is unchanged from it's default, and shorten it to
 # a more manageable length if it is
 if expr "$projectdesc" : "Unnamed repository.*$" >/dev/null
-- 
1.7.1


Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


offtopic: ppg design decisions - encapsulation

2013-05-06 Thread Martin Langhoff
[ Unashamedly offtopic... asking here because I like git design and
coding style, and ppg is drawing plenty of inspiration from the old
git shell scripts. Please kindly flame me privately... ]

ppg is a wrapper around git to maintain and distribute Puppet configs,
adding a few niceties.

Now, ppg will actually manage two git repositories -- one for the
puppet configs, the second one for ferrying back the puppet run
results to the originating repo (where they get loaded in a puppet
dashboard server for cute web-based reporting). The puppet config repo
is a normally-behaved git repo. The reports repo is a bit of a hack
-- never used directly by the user, it will follow a store-and-forward
scheme, where I should trim old history or just use something other
than git.

So I see two possible repo layouts:

- Transparent, nested
 .git/ # holding puppet configs, allows direct use of git commands
 .git/reports.git # nested inside puppet configs repo

- Mediated, parallel
 .ppg/puppet.git # all git commands must be wrapped
 .ppg/reports.git

My laziness and laissez-faire take on things drives me to use the
transparent/nested approach. Let the user do anything in there
directly with git.

OTOH, the mediated approach allows for more complete support,
including sanity checks on commands that don't have hooks available. I
already have a /usr/bin/ppg wrapper, which I could use to wrap all git
commands, setting GIT_DIR=.ppg/puppet.git for all ops. It would force
ops to be from the top of the tree (unless I write explicit support)
and I would have to implement that explicitly. And it would break related
tools that are not mediated via /usr/bin/git (gitk!).

Written this way, it seems to be a minimal lazy approach vs DTRT.

Am I missing any important aspect or option? Thoughts?
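For what it's worth, the mediated layout can be prototyped with the usual
bare-repo-plus-work-tree trick; everything here (.ppg/puppet.git, site.pp,
the ppg function) is a hypothetical sketch, not real ppg code:

```shell
# Sketch of the "mediated, parallel" layout: wrap every git command so it
# runs against .ppg/puppet.git with the current directory as work tree.
set -e
tmp=$(mktemp -d) && cd "$tmp"
mkdir -p .ppg
git init -q --bare .ppg/puppet.git

ppg() {
	# all git ops are mediated; they must run from the top of the tree
	GIT_DIR=.ppg/puppet.git GIT_WORK_TREE=. git "$@"
}

echo 'node default {}' >site.pp
ppg add site.pp
ppg -c user.name=t -c user.email=t@example.com commit -qm 'initial configs'
```

The downside noted above is visible here: tools that call /usr/bin/git
directly (gitk, IDEs) bypass the wrapper and see no repository.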

cheers,



m
--
 martin.langh...@gmail.com
 -  ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 ~ http://docs.moodle.org/en/User:Martin_Langhoff


Re: [PATCH] contrib/hooks/post-receive-email: get description from repo.git/config

2013-05-06 Thread Junio C Hamano
Trond Hasle Amundsen t.h.amund...@usit.uio.no writes:

 Hello,

 The included patch attempts to improve post-receive-email. It's a

Please don't ;-)

Eh, actually, thanks for the patch.

But when you send a patch the next time around, please have the
above and the next three lines (i.e. introductory text) _below_
the three-dash line.

 trivial change, but one that helps us Gitolite users. Comments are
 welcome, and this is my first attempt at contributing to the Git
 project. Please keep me on CC as I'm not on the list.


 From 878a7af9088e2bcc3afc9b09b9023f1f188c844b Mon Sep 17 00:00:00 2001
 From: Trond Hasle Amundsen t.h.amund...@usit.uio.no
 Date: Mon, 6 May 2013 15:41:25 +0200
 Subject: [PATCH] contrib/hooks/post-receive-email: get description from 
 repo.git/config

And remove these five lines above.  We will read the authorship and
subject from the e-mail header of your message.


 When getting the project description, we first try gitweb.description
 entry in repo.git/config, but repo.git/description takes precedence if
 it exists. This behaviour mimics that of Gitweb, and is what we want
 when using Gitolite, which deletes repo.git/description upon repo
 creation and rather maintains a gitweb.description entry in
 repo.git/config if a description is configured.

 Signed-off-by: Trond Hasle Amundsen t.h.amund...@usit.uio.no
 ---
  contrib/hooks/post-receive-email |9 -
  1 files changed, 8 insertions(+), 1 deletions(-)

 diff --git a/contrib/hooks/post-receive-email 
 b/contrib/hooks/post-receive-email
 index 0e5b72d..6ce046a 100755
 --- a/contrib/hooks/post-receive-email
 +++ b/contrib/hooks/post-receive-email
 @@ -714,7 +714,14 @@ if [ -z "$GIT_DIR" ]; then
  	exit 1
  fi
  
 -projectdesc=$(sed -ne '1p' "$GIT_DIR/description" 2>/dev/null)
 +# Get the repo's description. First try gitweb.description entry in
 +# repo.git/config, but repo.git/description takes precedence if it
 +# exists. This behaviour mimics that of Gitweb.
 +projectdesc=$(git config gitweb.description)
 +if [ -f "$GIT_DIR/description" ]; then
 +	projectdesc=$(sed -ne '1p' "$GIT_DIR/description" 2>/dev/null)
 +fi
 +
  # Check if the description is unchanged from it's default, and shorten it to
  # a more manageable length if it is
  if expr "$projectdesc" : "Unnamed repository.*$" >/dev/null

If the description file takes precedence, then the right order to do
this would be

projectdesc=$(sed -ne 1p "$GIT_DIR/description" 2>/dev/null)
if expr "$projectdesc" : "Unnamed repository" >/dev/null
then
	: use it as is
elif projectdesc=$(git config gitweb.description)
then
	: use it as is
else
	projectdesc="UNNAMED PROJECT"
fi

to avoid calling git config when it is not even necessary.

We can obviously shorten it by making it less readable, e.g.

projectdesc=$(sed -ne 1p "$GIT_DIR/description" 2>/dev/null)

! expr "$projectdesc" : "Unnamed repository" >/dev/null ||
projectdesc=$(git config gitweb.description) ||
projectdesc="UNNAMED PROJECT"

but I do not think we want to go in that direction ;-)


Re: [PATCH] contrib/hooks/post-receive-email: get description from repo.git/config

2013-05-06 Thread Matthieu Moy
Junio C Hamano gits...@pobox.com writes:

 Trond Hasle Amundsen t.h.amund...@usit.uio.no writes:

 Hello,

 The included patch attempts to improve post-receive-email. It's a

 Please don't ;-)

More precisely: you should have a look at git-multimail (in directory
contrib/, in branch for now pu/, or from
https://github.com/mhagger/git-multimail) before spending time on
post-receive-email.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

2013-05-06 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 By discarding marks on blobs, we may be robbing some optimization
 possibilities, and by discarding marks on tags, we may be robbing
 some features, from users of fast-export; we might want to add an
 option --use-object-marks={blob,commit,tag} or something to both
 fast-export and fast-import, so that the former can optionally write
 marks for non-commits out, and the latter can omit non commit marks
 if the user do not need them. But that is a separate issue.

s/--use-object-marks/--persistent-object-marks/; I think that would
express the issue better.


Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

2013-05-06 Thread Jeff King
On Mon, May 06, 2013 at 08:08:45AM -0700, Junio C Hamano wrote:

  I'm also not sure why your claim "we don't care about blobs" is true,
  because naively we would want future runs of fast-export to avoid having
  to write out the whole blob content when mentioning the blob again.
 
 The existing documentation is fairly clear that marks for objects
 other than commits are not exported, and the import-marks codepath
 discards anything but commits, so there is no mechanism for the
 existing fast-export users to leave blob marks in the marks file for
 later runs of fast-export to take advantage of.  The second
 invocation cannot refer to such a blob in the first place.

OK. If the argument is "we do not write them, so do not bother reading
them back in", I think that is reasonable. It could hurt anybody trying
to run fast-export against a marks file created by somebody else, but
that is also the same case that is being helped here (since otherwise,
we would not be seeing blob entries at all).

I do not offhand know enough about the internals of import/export-style
remote-helpers to say whether the hurt case even exists, let alone how
common it is.

 By discarding marks on blobs, we may be robbing some optimization
 possibilities, and by discarding marks on tags, we may be robbing
 some features, from users of fast-export; we might want to add an
 option --use-object-marks={blob,commit,tag} or something to both
 fast-export and fast-import, so that the former can optionally write
 marks for non-commits out, and the latter can omit non commit marks
 if the user do not need them. But that is a separate issue.

Yeah, that would allow the old behavior (and more) if anybody is hurt by
this. It is nice if the order of implementation is more features, then
flip the default because it provides an immediate escape hatch for
anybody who is hurt by the change in default. But again, I do not know
enough to say whether such hurt cases even exist.

-Peff


Re: [PATCH] contrib/hooks/post-receive-email: get description from repo.git/config

2013-05-06 Thread Trond Hasle Amundsen
Junio C Hamano gits...@pobox.com writes:

 But when you send a patch the next time around, please have the
 above and the next three lines (i.e. introductory text) _below_
 the three-dash line.

Allright, noted.

 From 878a7af9088e2bcc3afc9b09b9023f1f188c844b Mon Sep 17 00:00:00 2001
 From: Trond Hasle Amundsen t.h.amund...@usit.uio.no
 Date: Mon, 6 May 2013 15:41:25 +0200
 Subject: [PATCH] contrib/hooks/post-receive-email: get description from 
 repo.git/config

 And remove these five lines above.  We will read the authorship and
 subject from the e-mail header of your message.

So many rules.. ;) Also noted.

 +projectdesc=$(git config gitweb.description)
 +if [ -f "$GIT_DIR/description" ]; then
 +	projectdesc=$(sed -ne '1p' "$GIT_DIR/description" 2>/dev/null)
 +fi
 +
  # Check if the description is unchanged from it's default, and shorten it to
  # a more manageable length if it is
  if expr "$projectdesc" : "Unnamed repository.*$" >/dev/null

 If description file takes precedence, then the right order to do
 this would be

 projectdesc=$(sed -ne 1p "$GIT_DIR/description" 2>/dev/null)
 if expr "$projectdesc" : "Unnamed repository" >/dev/null
 then
 	: use it as is
 elif projectdesc=$(git config gitweb.description)
 then
 	: use it as is
 else
 	projectdesc="UNNAMED PROJECT"
 fi

 to avoid calling git config when it is not even necessary.

That doesn't work, you'll always call git config unless the string
matches "Unnamed repository". If you negate the expr line it still
doesn't work. To avoid calling git config I'd rather suggest something
like this:

	projectdesc=$(sed -ne 1p "$GIT_DIR/description" 2>/dev/null)
	if [ -z "$projectdesc" ]; then
		projectdesc=$(git config gitweb.description)
	fi

And let this block remain intact:

	if expr "$projectdesc" : "Unnamed repository.*$" >/dev/null
	then
		projectdesc="UNNAMED PROJECT"
	fi

The only change would then be the three added lines containing the if
block that calls git config if the projectdesc variable is
empty. The idea is just to get the description from config if the
description file doesn't exist.

Just curious.. why would we avoid calling git config unless necessary?

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [PATCH] contrib/hooks/post-receive-email: get description from repo.git/config

2013-05-06 Thread Trond Hasle Amundsen
Matthieu Moy matthieu@grenoble-inp.fr writes:

 The included patch attempts to improve post-receive-email. It's a

 Please don't ;-)

 More precisely: you should have a look at git-multimail (in directory
 contrib/, in branch "pu" for now, or from
 https://github.com/mhagger/git-multimail) before spending time on
 post-receive-email.

Thanks for the tip, I'll check it out. I used post-receive-email out of
convenience, as it was already there.

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

2013-05-06 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Mon, May 06, 2013 at 08:08:45AM -0700, Junio C Hamano wrote:

   I'm also not sure why your claim "we don't care about blobs" is true,
  because naively we would want future runs of fast-export to avoid having
  to write out the whole blob content when mentioning the blob again.
 
 The existing documentation is fairly clear that marks for objects
 other than commits are not exported, and the import-marks codepath
 discards anything but commits, so there is no mechanism for the
 existing fast-export users to leave blob marks in the marks file for
 later runs of fast-export to take advantage of.  The second
 invocation cannot refer to such a blob in the first place.

 OK. If the argument is we do not write them, so do not bother reading
 them back in, I think that is reasonable.

The way I read builtin/fast-export.c::import_marks() is that it is
more like "we do not write them, and we do not read them back in
either" IN THE CURRENT CODE.



Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

2013-05-06 Thread Jeff King
On Mon, May 06, 2013 at 09:32:41AM -0700, Junio C Hamano wrote:

  OK. If the argument is "we do not write them, so do not bother reading
  them back in", I think that is reasonable.
 
 The way I read builtin/fast-export.c::import_marks() is that it is
 more like we do not write them, and we do not read them back in
 either IN THE CURRENT CODE.

Ahh...I see now. It is not about skipping the blobs as a new behavior,
but rather about skipping them _earlier_, before we have loaded the
object contents from disk.

I took the "we don't care about" as "the general use of fast-export does
not care about", but it is "we will literally just drop them a few lines
later".

So yes, I think this is an obviously correct optimization. Thanks for
clarifying, and sorry to be so slow.

-Peff


[PATCH] l10n: de.po: translate 44 new messages

2013-05-06 Thread Ralf Thielow
Translate 44 new messages that came from the git.pot update in
c6bc7d4 (l10n: git.pot: v1.8.3 round 2 (44 new, 12 removed)).

Signed-off-by: Ralf Thielow ralf.thie...@gmail.com
---
 po/de.po | 139 +++
 1 file changed, 77 insertions(+), 62 deletions(-)

diff --git a/po/de.po b/po/de.po
index c823661..4073b42 100644
--- a/po/de.po
+++ b/po/de.po
@@ -134,74 +134,76 @@ msgstr 
 Benutzen Sie '\\!' für führende Ausrufezeichen.
 
 #: branch.c:60
-#, fuzzy, c-format
+#, c-format
 msgid Not setting branch %s as its own upstream.
-msgstr externer Zweig %s nicht im anderen Projektarchiv %s gefunden
+msgstr Zweig %s kann nicht sein eigener Ãœbernahmezweig sein.
 
 #: branch.c:82
 #, c-format
 msgid Branch %s set up to track remote branch %s from %s by rebasing.
-msgstr 
+msgstr Zweig %s konfiguriert zum Folgen von externem Zweig %s von %s durch 
+Neuaufbau.
 
 #: branch.c:83
 #, c-format
 msgid Branch %s set up to track remote branch %s from %s.
-msgstr 
+msgstr Zweig %s konfiguriert zum Folgen von externem Zweig %s von %s.
 
 #: branch.c:87
 #, c-format
 msgid Branch %s set up to track local branch %s by rebasing.
-msgstr 
+msgstr Zweig %s konfiguriert zum Folgen von lokalem Zweig %s durch Neuaufbau.
 
 #: branch.c:88
 #, c-format
 msgid Branch %s set up to track local branch %s.
-msgstr 
+msgstr Zweig %s konfiguriert zum Folgen von lokalem Zweig %s.
 
 #: branch.c:92
 #, c-format
 msgid Branch %s set up to track remote ref %s by rebasing.
-msgstr 
+msgstr Zweig %s konfiguriert zum Folgen von externer Referenz %s durch 
+Neuaufbau.
 
 #: branch.c:93
 #, c-format
 msgid Branch %s set up to track remote ref %s.
-msgstr 
+msgstr Zweig %s konfiguriert zum Folgen von externer Referenz %s.
 
 #: branch.c:97
 #, c-format
 msgid Branch %s set up to track local ref %s by rebasing.
-msgstr 
+msgstr Zweig %s konfiguriert zum Folgen von lokaler Referenz %s durch 
+Neuaufbau.
 
 #: branch.c:98
 #, c-format
 msgid Branch %s set up to track local ref %s.
-msgstr 
+msgstr Zweig %s konfiguriert zum Folgen von lokaler Referenz %s.
 
 #: branch.c:118
-#, fuzzy, c-format
+#, c-format
 msgid Tracking not set up: name too long: %s
-msgstr Markierungsname zu lang: %.*s...
+msgstr Konfiguration zum Folgen von Zweig nicht eingerichtet. Name zu lang: 
%s
 
 #: branch.c:137
-#, fuzzy, c-format
+#, c-format
 msgid Not tracking: ambiguous information for ref %s
-msgstr Kein externer Übernahmezweig für %s von %s
+msgstr Konfiguration zum Folgen von Zweig nicht eingerichtet. Referenz %s ist 
mehrdeutig.
 
 #: branch.c:182
-#, fuzzy, c-format
+#, c-format
 msgid '%s' is not a valid branch name.
-msgstr '%s' ist kein gültiger Markierungsname.
+msgstr '%s' ist kein gültiger Zweigname.
 
 #: branch.c:187
-#, fuzzy, c-format
+#, c-format
 msgid A branch named '%s' already exists.
-msgstr Arbeitsbaum '%s' existiert bereits.
+msgstr Zweig '%s' existiert bereits.
 
 #: branch.c:195
-#, fuzzy
 msgid Cannot force update the current branch.
-msgstr bezieht den aktuellen Zweig ein
+msgstr Kann Aktualisierung des aktuellen Zweiges nicht erzwingen.
 
 #: branch.c:201
 #, c-format
@@ -237,29 +239,27 @@ msgstr 
 beim Versand zu konfigurieren.
 
 #: branch.c:250
-#, fuzzy, c-format
+#, c-format
 msgid Not a valid object name: '%s'.
-msgstr %s ist kein gültiger Objekt-Name
+msgstr Ungültiger Objekt-Name: '%s'
 
 #: branch.c:270
-#, fuzzy, c-format
+#, c-format
 msgid Ambiguous object name: '%s'.
-msgstr fehlerhafter Objekt-Name '%s'
+msgstr mehrdeutiger Objekt-Name: '%s'
 
 #: branch.c:275
-#, fuzzy, c-format
+#, c-format
 msgid Not a valid branch point: '%s'.
-msgstr Ungültiger Zweig-Name: '%s'
+msgstr Ungültige Zweig-Version: '%s'
 
 #: branch.c:281
-#, fuzzy
 msgid Failed to lock ref for update
-msgstr Fehler beim Erstellen der Ausgabedateien.
+msgstr Fehler beim Sperren der Referenz zur Aktualisierung.
 
 #: branch.c:299
-#, fuzzy
 msgid Failed to write ref
-msgstr konnte %s nicht schreiben
+msgstr Fehler beim Schreiben der Referenz.
 
 #: bundle.c:36
 #, c-format
@@ -1161,14 +1161,14 @@ msgid Could not format %s.
 msgstr Konnte %s nicht formatieren.
 
 #: sequencer.c:1083
-#, fuzzy, c-format
+#, c-format
 msgid %s: can't cherry-pick a %s
-msgstr Kann \cherry-pick\ nicht in einem leeren Zweig ausführen.
+msgstr %s: %s kann nicht in \cherry-pick\ benutzt werden
 
 #: sequencer.c:1085
 #, c-format
 msgid %s: bad revision
-msgstr 
+msgstr %s: ungültige Revision
 
 #: sequencer.c:1119
 msgid Can't revert as initial commit
@@ -1561,6 +1561,9 @@ msgid ""
 "may speed it up, but you have to be careful not to forget to add\n"
 "new files yourself (see 'git help status')."
 msgstr ""
+"Es dauerte %.2f Sekunden die unbeobachteten Dateien zu bestimmen.\n"
+"'status -uno' könnte das beschleunigen, aber Sie müssen darauf achten,\n"
+"neue Dateien selbstständig hinzuzufügen (siehe 'git help status')."
 
 #: wt-status.c:1232
 #, c-format
@@ -1665,7 +1668,7 @@ msgstr "git add [Optionen] [--] [Pfadspezifikation...]"
 #. * eventually we can drop 

Re: [PATCH 0/7] Make $remote/$branch work with unconventional refspecs

2013-05-06 Thread Junio C Hamano
Johan Herland jo...@herland.net writes:

 Let me try to summarize my views on how refnames should work in Git, to
 see if we can identify where we differ on the principles (or if we, in
 fact, differ at all):

Thanks; I think I already said where I think we differ in a separate
message, but a short version is that the point of remote.$nick.fetch
mapping is to solve The remote may call a ref $this, which is not
the refname I want to or can use in my repository, so here is the
rule to use when importing it in my local namespace.  With the
mapping, I can name the ref in my local namespace conveniently.
E.g. their refs/heads/master cannot be our refs/heads/master at
the same time, so use refs/heads/origin/master.

The result of the above mapping, be it remotes/origin/master or
remotes/origin/heads/master, should be designed to be useful for the
local use of the ref in question.  If you further need to remap it
when using it locally, there is something wrong in the mapping you
defined in your remote.$nick.fetch mapping in the first place.

We do not force any structure under refs/remotes/; it is left
entirely up to the user, even though we would like to suggest the
best current practice by teaching clone and remote add to lay
them out in a certain way.

Another thing is that refs/remotes/ is not special at all.  If notes
hierarchies taken from a remote need to be somewhere other than
refs/notes/, it is perfectly fine to introduce refs/remote-notes/ if
that is the best layout when using them locally.  What is special is
refs/heads/ in that they are the _only_ refs you can check out to
the working tree and directly advance them by working on the working
tree files.

 I would support disallowing multi-level remote names, although I don't
 know if it is commonly used, and would break many existing users.

I somewhat doubt it.

We very much anticipated the use of multi-level branch names from
the very beginning and have support (e.g. in for-each-ref and
branch --list) to group/filter them according to prefixes, but I
do not think there is anywhere we consciously try to give support
for multi-level remote names to treat groups of remotes that share
the same prefix.

 *2* Perhaps bar in the above is spelled topics, and the
 hierarchy may be used to collect non-integration single topic
 branches from more than one remote.  An example that is more in
 line with such a usage might be:

 [remote jh]
 fetch = +refs/heads/*:refs/remotes/topics/heads/jh/*
 [remote jk]
 fetch = +refs/heads/*:refs/remotes/topics/heads/jk/*
 [remote fc]
 fetch = +refs/heads/*:refs/remotes/topics/heads/fc/*

 and I would expect git merge topics/jh/rbranch to merge the
 refs/remotes/topics/heads/jh/rbranch topic branch.

 I like the use case, but not necessarily your expectation. ;-)

 With the above configuration, and my series as-is, you could simply do
 git merge jh/rbranch to merge the refs/remotes/topics/heads/jh/rbranch
 topic branch.

That dropping of 'topics/' is the issue.  The user wanted to group
them under 'topics/' hierarchy and made a conscious effort to set up
the fetch refspec to map these refs there.  These are done all for
convenience when she deals with refs in her namespace in the
repository.  What justification do we have to second guess the user
and force her to drop it when naming these refs?

 Furthermore, I don't see why you want/need the extra
 heads/ level in the refspec.

Just like you wanted to have separate kinds of refs under a single
remote, the layout is grouping kinds of refs other than branch heads
related to the topics (as opposed to integration branches).
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/7] Make $remote/$branch work with unconventional refspecs

2013-05-06 Thread Junio C Hamano
Santi Béjar sa...@agolina.net writes:

 The next question could be: why not make it work with $url too? As in:

 $ git merge git://git.kernel.org/pub/scm/git/git.git//master

You need to remember merge is a local operation, working in your
local repository.  Like it or not, the UI of Git makes distinction
between operations that are local and those that go to other
repositories (e.g. git pull).

That of course does _not_ prevent you from writing an alternative UI
"git merge anything" that interprets the "anything" part and invokes
"git merge" or "git pull" as its helper (after all, Git is designed
to be scripted).


Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

2013-05-06 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 So yes, I think this is an obviously correct optimization. Thanks for
 clarifying, and sorry to be so slow.

No need to be sorry.  It just shows that the log message could have
been more helpful.

Here is what I tentatively queued.

commit 83582e91d22c66413b291d4d6d45bbeafddc2af9
Author: Felipe Contreras felipe.contre...@gmail.com
Date:   Sun May 5 17:38:53 2013 -0500

fast-export: do not parse non-commit objects while reading marks file

We read from the marks file and keep only marked commits, but in
order to find the type of object, we are parsing the whole thing,
which is slow, specially in big repositories with lots of big files.

There's no need for that, we can query the object information with
sha1_object_info().

Before this, loading the objects of a fresh emacs import, with 260598
blobs took 14 minutes, after this patch, it takes 3 seconds.

This is the way fast-import does it. Also die if the object is not
found (like fast-import).

Signed-off-by: Felipe Contreras felipe.contre...@gmail.com
Signed-off-by: Junio C Hamano gits...@pobox.com


Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

2013-05-06 Thread Jeff King
On Mon, May 06, 2013 at 10:17:41AM -0700, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  So yes, I think this is an obviously correct optimization. Thanks for
  clarifying, and sorry to be so slow.
 
 No need to be sorry.  It just shows that the log message could have
 been more helpful.
 
 Here is what I tentatively queued.
 [...]

Yeah, that is much for to understand (to me, at least).

Thanks.

-Peff


Re: [PATCH 0/7] Make $remote/$branch work with unconventional refspecs

2013-05-06 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

Just a few typofixes...

 Johan Herland jo...@herland.net writes:

 Let me try to summarize my views on how refnames should work in Git, to
 see if we can identify where we differ on the principles (or if we, in
 fact, differ at all):

 Thanks; I think I already said where I think we differ in a separate
 message, but a short version is that the point of remote.$nick.fetch
 mapping is to solve The remote may call a ref $this, which is not
 the refname I want to or can use in my repository, so here is the
 rule to use when importing it in my local namespace.  With the
 mapping, I can name the ref in my local namespace conveniently.
 E.g. their refs/heads/master cannot be our refs/heads/master at
 the same time, so use refs/heads/origin/master.

The last should be refs/remotes/origin/master.


 The result of the above mapping, be it remotes/origin/master or
 remotes/origin/heads/master, should be designed to be useful for the
 local use of the ref in question.  If you further need to remap it
 when using it locally, there is something wrong in the mapping you
 defined in your remote.$nick.fetch mapping in the first place.

 We do not force any structure under refs/remotes/; it is left
 entirely up to the user, even though we would like to suggest the
 best current practice by teaching clone and remote add to lay
 them out in a certain way.

 Another thing is that refs/remotes/ is not special at all.  If notes
 hierarchies taken from a remote need to be somewhere other than
 refs/notes/, it is perfectly fine to introduce refs/remote-notes/ if
 that is the best layout when using them locally.  What is special is

s/the best/the most convenient/;

 refs/heads/ in that they are the _only_ refs you can check out to
 the working tree and directly advance them by working on the working
 tree files.


Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

2013-05-06 Thread Jeff King
On Mon, May 06, 2013 at 01:19:35PM -0400, Jeff King wrote:

  Here is what I tentatively queued.
  [...]
 
 Yeah, that is much for to understand (to me, at least).

Ugh. That was supposed to be much easier to understand. Perhaps I will
learn to type one day.

-Peff


Re: [PATCH 1/7] shorten_unambiguous_ref(): Allow shortening refs/remotes/origin/HEAD to origin

2013-05-06 Thread Junio C Hamano
Johan Herland jo...@herland.net writes:

 ... there is AFAICS _no_ way for sscanf() - having
 already done one or more format extractions - to indicate to its caller
 that the input fails to match the trailing part of the format string.

Yeah, we can detect when we did not have enough, but we cannot tell
where it stopped matching.

It is interesting that this bug has stayed so long with us, which
may indicate that nobody actually uses the feature at all.

Good eyes.


 Cc: Bert Wesarg bert.wes...@googlemail.com
 Signed-off-by: Johan Herland jo...@herland.net
 ---
  refs.c  | 82 
 +++--
  t/t6300-for-each-ref.sh | 12 
  2 files changed, 43 insertions(+), 51 deletions(-)

 diff --git a/refs.c b/refs.c
 index d17931a..7231f54 100644
 --- a/refs.c
 +++ b/refs.c
 @@ -2945,80 +2945,60 @@ struct ref *find_ref_by_name(const struct ref *list, 
 const char *name)
   return NULL;
  }
  
 +int shorten_ref(const char *refname, const char *pattern, char *short_name)

Does this need to be an extern?

  {
 + /*
 +  * pattern must be of the form [pre]%.*s[post]. Check if refname
 +  * starts with [pre] and ends with [post]. If so, write the
 +  * middle part into short_name, and return the number of chars
 +  * written (not counting the added NUL-terminator). Otherwise,
 +  * if refname does not match pattern, return 0.
 +  */
 + size_t pre_len, post_start, post_len, match_len;
 + size_t ref_len = strlen(refname);
 + char *sep = strstr(pattern, "%.*s");
 + if (!sep || strstr(sep + 4, "%.*s"))
 + die("invalid pattern in ref_rev_parse_rules: %s", pattern);
 + pre_len = sep - pattern;
 + post_start = pre_len + 4;
 + post_len = strlen(pattern + post_start);
 + if (pre_len + post_len >= ref_len)
 + return 0; /* refname too short */
 + match_len = ref_len - (pre_len + post_len);
 + if (strncmp(refname, pattern, pre_len) ||
 + strncmp(refname + ref_len - post_len, pattern + post_start, post_len))
 + return 0; /* refname does not match */
 + memcpy(short_name, refname + pre_len, match_len);
 + short_name[match_len] = '\0';
 + return match_len;
  }

OK. Looks correct, even though I suspect some people might come up
with a more concise way to express the above.

  char *shorten_unambiguous_ref(const char *refname, int strict)
  {
   int i;
   char *short_name;
  
   /* skip first rule, it will always match */
 - for (i = nr_rules - 1; i > 0 ; --i) {
 + for (i = ARRAY_SIZE(ref_rev_parse_rules) - 1; i > 0 ; --i) {
   int j;
   int rules_to_fail = i;
   int short_name_len;
  
 + if (!ref_rev_parse_rules[i] ||

What is this skippage about?  Isn't it what you already compensated
away by starting from ARRAY_SIZE() - 1?

Ahh, no.  But wait.  Isn't there a larger issue here?

 + !(short_name_len = shorten_ref(refname,
 +ref_rev_parse_rules[i],
 +short_name)))
   continue;
  
 - short_name_len = strlen(short_name);
 -
   /*
* in strict mode, all (except the matched one) rules
* must fail to resolve to a valid non-ambiguous ref
*/
   if (strict)
 - rules_to_fail = nr_rules;
 + rules_to_fail = ARRAY_SIZE(ref_rev_parse_rules);

Isn't nr_rules in the original ARRAY_SIZE()-1?

  
   /*
* check if the short name resolves to a valid ref,

Could you add a test to trigger the strict codepath?

Thanks.

 diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
 index 752f5cb..57e3109 100755
 --- a/t/t6300-for-each-ref.sh
 +++ b/t/t6300-for-each-ref.sh
 @@ -466,4 +466,16 @@ test_expect_success 'Verify sort with multiple keys' '
 refs/tags/bogo refs/tags/master > actual &&
   test_cmp expected actual
  '
 +
 +cat >expected <<\EOF
 +origin
 +origin/master
 +EOF
 +
 +test_expect_success 'Check refs/remotes/origin/HEAD shortens to origin' '
 + git remote set-head origin master &&
 + git for-each-ref --format="%(refname:short)" refs/remotes > actual &&
 + test_cmp expected actual
 +'
 +
  test_done


Bug: git-p4: Sometimes p4 generates Windows-style output on OS X

2013-05-06 Thread David Foster
I've observed that the p4 command that git-p4 delegates to occasionally 
outputs Windows-style line endings even on the OS X platform. When this 
happens, git-p4 gets very confused and crashes out.


I've attached a patch which seems to fix the issue in my case. Now this 
patch is a pretty bad hack, and I don't recommend that it be accepted 
as-is. It is just a starting point.


A real fix would determine in advance whether Perforce was going to emit 
Windows-style output. Since I don't know the circumstances under which 
this happens on non-Windows platforms, I can't provide a better patch. 
Someone who has intimate knowledge of p4's operating modes would be best 
to examine what's really going on with p4.


P.S. In case it matters, I am not subscribed to this mailing list, so 
you will need to CC me for any replies to reach me.


---
David Foster
http://dafoster.net/
From aef963f0c45dea81f3e6f30d3b4185a0983ca4de Mon Sep 17 00:00:00 2001
From: David Foster davidf...@gmail.com
Date: Mon, 6 May 2013 10:50:01 -0700
Subject: [PATCH] Compensate for Windows-style output from the p4 command on
 non-Windows systems.

---
 git-p4.py |   18 ++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 647f110..949d66d 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -1454,6 +1454,24 @@ class P4Submit(Command, P4UserMap):
             tmpFile = open(fileName, "rb")
             message = tmpFile.read()
             tmpFile.close()
+
+            # HACK: If Perforce spontaneously generates Windows-style output,
+            #       compensate by assuming the entire p4 command went into
+            #       Windows mode.
+            if separatorLine not in message:
+                print "WARNING: Perforce has spontaneously decided to generate Windows-style output. Compensating."
+
+                # Assume that Perforce is now inexplicably operating in Windows mode
+                self.isWindows = True
+
+                # Retroactively rewrite expected output
+                submitTemplate = submitTemplate.replace("\n", "\r\n")
+                separatorLine = separatorLine.replace("\n", "\r\n")
+                newdiff = newdiff.replace("\n", "\r\n")
+
+                if separatorLine not in message:
+                    raise ValueError('Confused. Thought Perforce went into Windows mode but apparently something else is wrong.')
+
             submitTemplate = message[:message.index(separatorLine)]
             if self.isWindows:
                 submitTemplate = submitTemplate.replace("\r\n", "\n")
-- 
1.7.7.5 (Apple Git-26)



Re: another packed-refs race

2013-05-06 Thread Jeff King
On Mon, May 06, 2013 at 02:03:40PM +0200, Michael Haggerty wrote:

  We could fix this by making sure our packed-refs file is up to date
 
 s/file/cache/

Yeah, I meant our view of the packed-refs file, but I think cache
says that more succinctly. I'll be sure to make it clear when I start
writing real commit messages.

 Let me think out loud alongside your analysis...
 
 By this mechanism the reader can ensure that it never uses a version of
 the packed-refs file that is older than its information that the
 corresponding loose ref is absent from the filesystem.

Yes, that's a good way of saying it.

 This is all assuming that the filesystem accesses have a defined order;
 how is that guaranteed?  pack_refs() and commit_ref() both rely on
 commit_lock_file(), which calls
 
 close(fd) on the lockfile
 rename(lk->filename, result_file)
 
 prune_ref() locks the ref, verifies that its SHA-1 is unchanged, then
 calls unlink(), then rollback_lock_file().
 
 The atomicity of rename() guarantees that a reader sees either the old
 or the new version of the file in question.  But what guarantees are
 there about accesses across two files?

I don't know offhand if any standard makes such guarantees. But there
are other parts of git that do depend on that ordering. For example, I
create a loose object representing commit X. Then I update a ref to
point to X. If the ordering of those operations is not preserved, then a
simultaneous reader would think that the repository is corrupted (the
ref points to a missing object).

I would not be surprised if there are distributed or networked
filesystems for which this property does not hold. But I suspect it does
hold for operations on a single machine with a decent kernel (I would
imagine the VFS layer takes care of this). I am just basing my suspicion
on experimental results (the patch I sent does seem to work on my ext4
system).

 The loose refs cache is only used by the for_each_ref() functions, not
 for looking up individual references.  Another approach would be to
 change the top-level for_each_ref() functions to re-stat() all of the
 loose references within the namespace that interests it, *then* verify
 that the packed-ref cache is not stale, *then* start the iteration.
 Then there would be no need to re-stat() files during the iteration.
 (This would mean that we have to prohibit a second reference iteration
 from being started while one is already in progress.)

Hmm. Thinking on this more, I'm not sure that we need to stat the loose
references at all. We do not need to care if the loose refs are up to
date or not. Well, we might care, but the point here is not to pretend
that we have an up-to-date atomic view of the loose refs; it is only to
make sure that the fallback-to-packed behavior does not lie to us about
the existence or value of a ref.

IOW, it is OK to come up with a value for ref X that was true at the
beginning of the program, even if it has been simultaneously updated.
Our program can operate as if it happened in the instant it started,
even though in real life it takes longer. But it is _not_ OK to miss the
existence of a ref, or to come up with a value that it did not hold at
some point during the program (e.g., it is not OK to return some cruft
we wrote into the packed-refs file when we packed it three weeks ago).

That is a weaker guarantee, and I think we can provide it with:

  1. Load all loose refs into cache for a particular enumeration.

  2. Make sure the packed-refs cache is up-to-date (by checking its
 stat() information and reloading if necessary).

  3. Run the usual iteration over the loose/packed ref caches.

If a loose ref is updated after or during (1), that's OK. The ref
hierarchy is not atomic, so it is possible for us to see a state that
never existed (e.g., we read X, somebody updates X to X' and Y to Y',
then we read Y'; we see X and Y', a state that never existed on disk).
But either the ref was already loose, in which case we always see its
loose value as it was when we read it, or it was previously only packed
(or non-existent), in which case we see get its value from the
packed-refs (or assume it does not exist), which is its state at the
start of our program. If the loose ref is deleted instead of updated,
it's the same thing; we may see it as existing, as it was at the start
of our program.

If the packed-refs file is updated after we have read all of the loose
refs, we may see it ahead of the loose refs state we have cached. And
that may mean the packed-refs file even has more up-to-date values. But
we don't have to care, as long as we return some value that was valid
during the course of our program. And we do, either because we have the
loose value cached (in which case it trumps the packed-refs version and
we use it), or it was missing (in which case we will use the updated
pack-refs value, which may not even be the most recent value, but is at
least a value that the ref had to hold at some point during the 

Re: another packed-refs race

2013-05-06 Thread Jeff King
On Mon, May 06, 2013 at 02:12:29PM +0200, Michael Haggerty wrote:

  I think that would be correct (modulo that step 1 cannot happen for
  enumeration). But we would like to avoid loading all loose refs if we
  can. Especially on a cold cache, it can be quite slow, and you may not
  even care about those refs for the current operation (I do not recall
  the exact original motivation for the lazy loading, but it was something
  along those lines).
 
 Lazy loading was first inspired by the observation that effectively
 every git invocation was loading *all* loose references to do an
 iteration over refs/replace/ (which I've never even used!)  This was
 absolutely killing the performance of filter-branch, which creates a lot
 of loose references and invokes git many times--even though the cache
 was warm.

Yeah, obviously we want to avoid that. I _think_ we can even keep the
lazy loading, as long as keep the ordering as:

  1. Load a chunk of loose refs (whatever we need for the current
 iteration request).

  2. Double-check that our packed-refs cache is up to date, and reload
 if necessary.

  3. Return the results to the caller.

It would perhaps increase latency to getting results to the caller
(since otherwise we can start feeding answers to the caller as we walk
the tree), but should not increase the total amount of work (just the
extra stat() of packed-refs, once per for_each_ref, which is not much).

I'll see if I can cook up a patch.

-Peff


Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

2013-05-06 Thread Felipe Contreras
On Mon, May 6, 2013 at 7:31 AM, Jeff King p...@peff.net wrote:
 On Sun, May 05, 2013 at 05:38:53PM -0500, Felipe Contreras wrote:

 We don't care about blobs, or any object other than commits, but in
 order to find the type of object, we are parsing the whole thing, which
 is slow, specially in big repositories with lots of big files.

 I did a double-take on reading this subject line and first paragraph,
 thinking surely fast-export needs to actually output blobs?.

If you think that, then you are not familiar with the code.

--export-marks=<file>::
Dumps the internal marks table to file when complete.
Marks are written one per line as `:markid SHA-1`. Only marks
for revisions are dumped; marks for blobs are ignored.

if (deco->base && deco->base->type == 1) {
        mark = ptr_to_mark(deco->decoration);
        if (fprintf(f, ":%"PRIu32" %s\n", mark,
                    sha1_to_hex(deco->base->sha1)) < 0) {
                e = 1;
                break;
        }
}

 Reading the patch, I see that this is only about not bothering to load
 blob marks from --import-marks. It might be nice to mention that in the
 commit message, which is otherwise quite confusing.

The commit message says it exactly like it is: we don't care about blobs.

If an object is not a commit, we *already* skip it. But as the commit
message already says, we do so by parsing the whole thing.

 I'm also not sure why your claim we don't care about blobs is true,
 because naively we would want future runs of fast-export to avoid having
 to write out the whole blob content when mentioning the blob again.

Because it's pointless to have hundreds and thousands of blob marks
that are *never* going to be used, only for an extremely tiny minority
that would.

 Does that match your reasoning?

It doesn't matter, it has been that way since --export-marks was introduced.

 Before this, loading the objects of a fresh emacs import, with 260598
 blobs took 14 minutes, after this patch, it takes 3 seconds.

 Presumably most of that speed improvement comes from not parsing the
 blob objects. I wonder if you could get similar speedups by applying the
 do not bother parsing rule from your patch 3. You would still incur
 some cost to create a struct blob, but it may or may not be
 measurable.  That would mean we get the case not worth worrying about
 from above for free. I doubt it would make that big a difference,
 though, given the rarity of it. So I am OK with it either way.

How would I know if it's a blob or a commit, if not by the code this
patch introduces?

-- 
Felipe Contreras


Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

2013-05-06 Thread Felipe Contreras
On Mon, May 6, 2013 at 10:08 AM, Junio C Hamano gits...@pobox.com wrote:
 Jeff King p...@peff.net writes:

 On Sun, May 05, 2013 at 05:38:53PM -0500, Felipe Contreras wrote:

 We don't care about blobs, or any object other than commits, but in
 order to find the type of object, we are parsing the whole thing, which
 is slow, specially in big repositories with lots of big files.

 I did a double-take on reading this subject line and first paragraph,
 thinking surely fast-export needs to actually output blobs?.

 Reading the patch, I see that this is only about not bothering to load
 blob marks from --import-marks. It might be nice to mention that in the
 commit message, which is otherwise quite confusing.

 I had the same reaction first, but not writing the blob _objects_
 out to the output stream would not make any sense, so it was fairly
 easy to guess what the author wanted to say ;-).

That's how fast-export has worked since --export-marks was introduced.

 I'm also not sure why your claim we don't care about blobs is true,
 because naively we would want future runs of fast-export to avoid having
 to write out the whole blob content when mentioning the blob again.

 The existing documentation is fairly clear that marks for objects
 other than commits are not exported, and the import-marks codepath
 discards anything but commits, so there is no mechanism for the
 existing fast-export users to leave blob marks in the marks file for
 later runs of fast-export to take advantage of.  The second
 invocation cannot refer to such a blob in the first place.

 The story is different on the fast-import side, where we do say we
 dump the full table and a later run can depend on these marks.

Yes, and gaining nothing but increased disk-space.

 By discarding marks on blobs, we may be robbing some optimization
 possibilities, and by discarding marks on tags, we may be robbing
 some features, from users of fast-export; we might want to add an
 option --use-object-marks={blob,commit,tag} or something to both
 fast-export and fast-import, so that the former can optionally write
 marks for non-commits out, and the latter can omit non commit marks
 if the user do not need them. But that is a separate issue.

How? The only way we might rob optimizations is if there's an obscene
amount of files, otherwise the number of blob marks that we are
*actually* going to use ever again is extremely tiny.

-- 
Felipe Contreras


Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

2013-05-06 Thread Jeff King
On Mon, May 06, 2013 at 02:02:13PM -0500, Felipe Contreras wrote:

  I did a double-take on reading this subject line and first paragraph,
  thinking surely fast-export needs to actually output blobs?.
 
 If you think that, then you are not familiar with the code.
 
 --export-marks=<file>::
 [...]

My point was that nothing in the subject line nor that first paragraph
(nor, for that matter, the entire commit message) says that we are
talking about marks here.

  Reading the patch, I see that this is only about not bothering to load
  blob marks from --import-marks. It might be nice to mention that in the
  commit message, which is otherwise quite confusing.
 
 The commit message says it exactly like it is: we don't care about blobs.

If you guess that we means the marks code and not all of fast-export,
then yes. But I do not have any desire to get into another debate trying
to convince you that there is value to having a clear commit message.
Junio has already proposed a much more readable one.

-Peff


Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

2013-05-06 Thread Felipe Contreras
On Mon, May 6, 2013 at 11:20 AM, Jeff King p...@peff.net wrote:
 On Mon, May 06, 2013 at 08:08:45AM -0700, Junio C Hamano wrote:

  I'm also not sure why your claim we don't care about blobs is true,
  because naively we would want future runs of fast-export to avoid having
  to write out the whole blob content when mentioning the blob again.

 The existing documentation is fairly clear that marks for objects
 other than commits are not exported, and the import-marks codepath
 discards anything but commits, so there is no mechanism for the
 existing fast-export users to leave blob marks in the marks file for
 later runs of fast-export to take advantage of.  The second
 invocation cannot refer to such a blob in the first place.

 OK. If the argument is we do not write them, so do not bother reading
 them back in, I think that is reasonable.

We already do that:

5d3698f fast-export: avoid importing blob marks

 It could hurt anybody trying
 to run fast-export against a marks file created by somebody else, but
 that is also the same case that is being helped here (since otherwise,
 we would not be seeing blob entries at all).

 I do not offhand know enough about the internals of import/export-style
 remote-helpers to say whether the hurt case even exists, let alone how
 common it is.

 By discarding marks on blobs, we may be robbing some optimization
 possibilities, and by discarding marks on tags, we may be robbing
 some features, from users of fast-export; we might want to add an
 option --use-object-marks={blob,commit,tag} or something to both
 fast-export and fast-import, so that the former can optionally write
 marks for non-commits out, and the latter can omit non commit marks
 if the user do not need them. But that is a separate issue.

 Yeah, that would allow the old behavior (and more) if anybody is hurt by
 this.

There is no behavior change in this patch. We do *exactly* the same as before.
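The marks behavior under discussion is easy to check empirically. A self-contained sketch (scratch repository; file names invented, `git` on PATH assumed): the fast-export stream carries blob commands, but the `--export-marks` file records marks only for commits.

```shell
tmp=$(mktemp -d) && cd "$tmp"
git init -q .
git config user.email you@example.com
git config user.name you
echo one >f && git add f && git commit -qm one
echo two >f && git commit -qam two
git fast-export --export-marks=marks.txt HEAD >stream.txt
grep -c '^blob' stream.txt   # the two blobs do appear in the stream
grep -c '^:' marks.txt       # but only the two commits get a marks line
```

So a later `--import-marks` run has nothing to say about blobs in the first place, which is why discarding blob entries on read costs nothing.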

-- 
Felipe Contreras


Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

2013-05-06 Thread Felipe Contreras
On Mon, May 6, 2013 at 2:11 PM, Jeff King p...@peff.net wrote:
 On Mon, May 06, 2013 at 02:02:13PM -0500, Felipe Contreras wrote:

  I did a double-take on reading this subject line and first paragraph,
  thinking "surely fast-export needs to actually output blobs?".

 If you think that, then you are not familiar with the code.

 --export-marks=<file>::
 [...]

 My point was that nothing in the subject line nor that first paragraph
 (nor, for that matter, the entire commit message) says that we are
 talking about marks here.

s/$/ while loading marks/. Fixed.

-- 
Felipe Contreras


Re: Fwd: Uninit'ed submodules

2013-05-06 Thread Jens Lehmann
On 06.05.2013 02:19, Chris Packham wrote:
 This did get me thinking. Why does an uninitialized submodule need to
 have an empty directory? If it didn't the maintainer in question
 probably would have realized that he needed to run "git submodule
 update --init" when his "cd submodule" command failed.
 
 I'm guessing there is a good reason for the empty directory - perhaps
 so that git can notice the fact that it exists in the worktree but is
 out of date?  If it does need to have some presence in the worktree
 why not as a file? That way the cd command would still fail (albeit
 with a different error) providing the necessary indication to the
 user. The submodule update --init could then change from file -> dir
 when it actually gets populated.

Hmm, to me an empty directory is the natural representation of an
unpopulated submodule, but I see why that made it hard for your
maintainer to notice the fact that the submodule was uninitialized.
I suspect changing an unpopulated submodule to be represented by a
file will surprise quite a few users (some of whom will probably
come up with perfectly valid use cases such a change will break).

What about the following: Today's Git completely ignores empty
submodule directories, but I think that when the recursive checkout
support is there, the submodule.autoupdate flag - which I believe
should control that behavior - could also make those empty submodule
directories show up in git status as being unpopulated (after all
they are configured to be updated automatically, so not having them
populated is something Git should show). Would something like this
have helped here?

Until then I can only propose to establish a best practice of using
git clone --recurse-submodules in these situations to avoid the
problem you described.


Re: [PATCH 0/7] Make $remote/$branch work with unconventional refspecs

2013-05-06 Thread Santi Béjar
On 06/05/2013 19:11, Junio C Hamano gits...@pobox.com wrote:

 Santi Béjar sa...@agolina.net writes:

  The next question could be: why not make it work with $url too? As in:
 
  $ git merge git://git.kernel.org/pub/scm/git/git.git//master

 You need to remember "merge" is a local operation, working in your
 local repository.  Like it or not, the UI of Git makes distinction
 between operations that are local and those that go to other
 repositories (e.g. git pull).


Of course! In fact I wanted to say git pull.
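For the record, the form that already works today, naming a repository URL plus a ref on `git pull`, can be sketched with a local path standing in for the URL (scratch repositories, names invented):

```shell
tmp=$(mktemp -d) && cd "$tmp"
git init -q origin
(cd origin &&
 git config user.email you@example.com && git config user.name you &&
 echo one >f && git add f && git commit -qm one)
git clone -q origin work
(cd origin && echo two >>f && git commit -qam two)
cd work
git pull -q ../origin HEAD   # repository + ref, no configured remote needed
git rev-list --count HEAD    # fast-forwarded to the second commit
```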

Santi


[PATCH v6 0/7] interactive git clean

2013-05-06 Thread Jiang Xin
Implement a 'git add --interactive' style of interactive git-clean.
It shows what would be done before starting to clean.  See
``Interactive mode`` for details.

Interactive mode

When the command enters the interactive mode, it shows the
files and directories to be cleaned, and goes into its
interactive command loop.

The command loop shows the list of subcommands available, and
gives a prompt "What now> ".  In general, when the prompt ends
with a single '>', you can pick only one of the choices given
and type return, like this:


*** Commands ***
  1: clean 2: edit by patterns   3: edit by numbers
  4: rm -i 5: quit   6: help
What now> 2


You could also say `c` or `clean` above as long as the choice is unique.
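The unique-prefix matching used here can be sketched in plain shell (a hypothetical helper for illustration, not the C implementation):

```shell
# Resolve user input to a menu command by unique prefix; print "?" when
# the prefix matches zero or several commands.
resolve() {
	input=$1; shift
	hit= n=0
	for cmd in "$@"; do
		case "$cmd" in
		"$input"*) hit=$cmd; n=$((n + 1)) ;;
		esac
	done
	if [ "$n" -eq 1 ]; then echo "$hit"; else echo "?"; fi
}
resolve c clean "edit by patterns" "edit by numbers" quit help   # -> clean
resolve e clean "edit by patterns" "edit by numbers" quit help   # -> ? (ambiguous)
```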

The main command loop has 6 subcommands.

clean::

   Start cleaning files and directories, and then quit.

edit by patterns::

   This shows the files and directories to be deleted and issues an
   "Input ignore patterns>>" prompt. You can input space-separated
   patterns to exclude files and directories from deletion.
   E.g. "*.c *.h" will exclude files ending with .c and .h from
   deletion. When you are satisfied with the filtered result, press
   ENTER (empty) to return to the main menu.

edit by numbers::

   This shows the files and directories to be deleted and issues a
   "Select items to delete>>" prompt. When the prompt ends with double
   '>>' like this, you can make more than one selection, concatenated
   with whitespace or comma.  Also you can say ranges.  E.g. "2-5 7,9"
   to choose 2,3,4,5,7,9 from the list.  If the second number in a
   range is omitted, all remaining items are taken.  E.g. "7-" to
   choose 7,8,9 from the list.  You can say '*' to choose everything.
   Also when you are satisfied with the filtered result, press ENTER
   (empty) to return to the main menu.
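The range syntax described above can be sketched as follows (a rough shell helper; `seq` from coreutils assumed, and the highest item number is passed in explicitly):

```shell
# Expand a selection such as "2-5 7,9" into individual numbers,
# treating an open-ended range like "7-" as running to $2 (the last item).
expand_ranges() {
	max=$2
	echo "$1" | tr ' ,' '\n\n' | while read -r tok; do
		case "$tok" in
		'')  ;;                                # skip empty tokens
		*-)  seq "${tok%-}" "$max" ;;          # "7-"  -> 7..max
		*-*) seq "${tok%-*}" "${tok#*-}" ;;    # "2-5" -> 2..5
		*)   echo "$tok" ;;                    # plain number
		esac
	done
}
expand_ranges '2-5 7,9' 9 | tr '\n' ' '   # -> 2 3 4 5 7 9
```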

rm -i::

  This will show a "rm -i" style cleaning: you must confirm the items
  one by one in order to delete them. This action is not as efficient
  as the above two actions.

quit::

  This lets you quit without doing any cleaning.

help::

  Show brief usage of interactive git-clean.



Jiang Xin (7):
  Add support for -i/--interactive to git-clean
  Show items of interactive git-clean in columns
  Add colors to interactive git-clean
  git-clean: use a git-add-interactive compatible UI
  git-clean: interactive cleaning by select numbers
  git-clean: rm -i style interactive cleaning
  git-clean: update document for interactive git-clean

 Documentation/config.txt|   4 +
 Documentation/git-clean.txt |  71 -
 builtin/clean.c | 700 ++--
 3 files changed, 752 insertions(+), 23 deletions(-)

-- 
1.8.3.rc1.338.gb35aa5d



[PATCH v6 1/7] Add support for -i/--interactive to git-clean

2013-05-06 Thread Jiang Xin
Show what would be done, and require the user to confirm before actually
cleaning. In the confirmation dialog, the user has three choices:

 * y/yes:  Start to do cleaning.
 * n/no:   Nothing will be deleted.
 * e/edit: Exclude items from deletion using ignore patterns.

When the user chooses the edit mode, the user can input space-
separated patterns (the same syntax as gitignore), and each clean
candidate that matches one of the patterns will be excluded from
cleaning. When the user is satisfied, pressing ENTER (empty input)
returns to the confirmation dialog.
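The exclusion step can be illustrated with a small shell sketch (file names invented; `set -f` keeps the patterns from globbing against real files in the current directory):

```shell
set -f   # treat *.c / *.h as patterns, not as globs against the cwd
candidates='main.c util.h README notes.txt'
patterns='*.c *.h'

remaining=
for f in $candidates; do
	keep=1
	for p in $patterns; do
		case "$f" in
		$p) keep=0; break ;;   # matches an ignore pattern: excluded
		esac
	done
	[ "$keep" = 1 ] && remaining="$remaining $f"
done
echo "would remove:$remaining"   # -> would remove: README notes.txt
set +f
```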

Signed-off-by: Jiang Xin worldhello@gmail.com
Suggested-by: Junio C Hamano gits...@pobox.com
Spelling-checked-by: Eric Sunshine sunsh...@sunshineco.com
Comments-by: Matthieu Moy matthieu@imag.fr
Suggested-by: Eric Sunshine sunsh...@sunshineco.com
---
 Documentation/git-clean.txt |  15 +++-
 builtin/clean.c | 195 
 2 files changed, 191 insertions(+), 19 deletions(-)

diff --git a/Documentation/git-clean.txt b/Documentation/git-clean.txt
index bdc3a..f5572 100644
--- a/Documentation/git-clean.txt
+++ b/Documentation/git-clean.txt
@@ -8,7 +8,7 @@ git-clean - Remove untracked files from the working tree
 SYNOPSIS
 
 [verse]
-'git clean' [-d] [-f] [-n] [-q] [-e <pattern>] [-x | -X] [--] <path>...
+'git clean' [-d] [-f] [-i] [-n] [-q] [-e <pattern>] [-x | -X] [--] <path>...
 
 DESCRIPTION
 ---
@@ -34,7 +34,18 @@ OPTIONS
 -f::
 --force::
If the Git configuration variable clean.requireForce is not set
-   to false, 'git clean' will refuse to run unless given -f or -n.
+   to false, 'git clean' will refuse to run unless given -f, -n or
+   -i.
+
+-i::
+--interactive::
+   Show what would be done and the user must confirm before actually
+   cleaning. In the confirmation dialog, the user can choose to abort
+   the cleaning, or enter into an edit mode. In the edit mode, the
+   user can input space-separated patterns (the same syntax as
+   gitignore), and each clean candidate that matches with one of the
+   patterns will be excluded from cleaning. When the user feels it's
+   OK, presses ENTER and back to the confirmation dialog.
 
 -n::
 --dry-run::
diff --git a/builtin/clean.c b/builtin/clean.c
index 04e39..29fbf 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -15,9 +15,12 @@
 #include "quote.h"
 
 static int force = -1; /* unset */
+static int interactive;
+static struct string_list del_list = STRING_LIST_INIT_DUP;
+static const char **the_prefix;
 
 static const char *const builtin_clean_usage[] = {
-   N_("git clean [-d] [-f] [-n] [-q] [-e <pattern>] [-x | -X] [--] <paths>..."),
+   N_("git clean [-d] [-f] [-i] [-n] [-q] [-e <pattern>] [-x | -X] [--] <paths>..."),
NULL
 };
 
@@ -142,6 +145,139 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
return ret;
 }
 
+void edit_by_patterns_cmd()
+{
+   struct dir_struct dir;
+   struct strbuf confirm = STRBUF_INIT;
+   struct strbuf buf = STRBUF_INIT;
+   struct strbuf **ignore_list;
+   struct string_list_item *item;
+   struct exclude_list *el;
+   const char *qname;
+   int changed = -1, i;
+
+   while (1) {
+   /* dels list may become empty when we run string_list_remove_empty_items later */
+   if (!del_list.nr) {
+   printf_ln(_("No more files to clean, exiting."));
+   break;
+   }
+
+   if (changed) {
+   putchar('\n');
+
+   /* Display dels in "Would remove ..." format */
+   for_each_string_list_item(item, &del_list) {
+   qname = quote_path_relative(item->string, -1, &buf, *the_prefix);
+   printf(_(msg_would_remove), qname);
+   }
+   putchar('\n');
+   }
+
+   printf(_("Input ignore patterns>> "));
+   if (strbuf_getline(&confirm, stdin, '\n') != EOF) {
+   strbuf_trim(&confirm);
+   } else {
+   putchar('\n');
+   break;
+   }
+
+   /* Quit edit mode */
+   if (!confirm.len)
+   break;
+
+   memset(&dir, 0, sizeof(dir));
+   el = add_exclude_list(&dir, EXC_CMDL, "manual exclude");
+   ignore_list = strbuf_split_max(&confirm, ' ', 0);
+
+   for (i = 0; ignore_list[i]; i++) {
+   strbuf_trim(ignore_list[i]);
+   if (!ignore_list[i]->len)
+   continue;
+
+   add_exclude(ignore_list[i]->buf, "", 0, el, -(i+1));
+   }
+
+   changed = 0;
+   for_each_string_list_item(item, &del_list) {
+   int dtype = DT_UNKNOWN;
+   const 

[PATCH v6 2/7] Show items of interactive git-clean in columns

2013-05-06 Thread Jiang Xin
When there are lots of items to be cleaned, it is hard to see them all
in one screen. Showing them in columns instead of in one column will
solve this problem.

Since we no longer show items to be cleaned using the "Would remove ..."
format (only plain filenames) in interactive mode, we add instructions
and warnings as a header before them.

Signed-off-by: Jiang Xin worldhello@gmail.com
Comments-by: Matthieu Moy matthieu@imag.fr
---
 Documentation/config.txt |  4 
 builtin/clean.c  | 58 +---
 2 files changed, 44 insertions(+), 18 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6e53f..98bfa 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -955,6 +955,10 @@ column.branch::
Specify whether to output branch listing in `git branch` in columns.
See `column.ui` for details.
 
+column.clean::
+   Specify whether to output cleaning files in `git clean -i` in columns.
+   See `column.ui` for details.
+
 column.status::
Specify whether to output untracked files in `git status` in columns.
See `column.ui` for details.
diff --git a/builtin/clean.c b/builtin/clean.c
index 29fbf..43383 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -13,11 +13,13 @@
 #include "refs.h"
 #include "string-list.h"
 #include "quote.h"
+#include "column.h"
 
 static int force = -1; /* unset */
 static int interactive;
 static struct string_list del_list = STRING_LIST_INIT_DUP;
 static const char **the_prefix;
+static unsigned int colopts;
 
 static const char *const builtin_clean_usage[] = {
N_("git clean [-d] [-f] [-i] [-n] [-q] [-e <pattern>] [-x | -X] [--] <paths>..."),
@@ -32,8 +34,13 @@ static const char *msg_warn_remove_failed = N_("failed to remove %s");
 
 static int git_clean_config(const char *var, const char *value, void *cb)
 {
-   if (!strcmp(var, "clean.requireforce"))
+   if (!prefixcmp(var, "column."))
+   return git_column_config(var, value, "clean", &colopts);
+
+   if (!strcmp(var, "clean.requireforce")) {
force = !git_config_bool(var, value);
+   return 0;
+   }
return git_default_config(var, value, cb);
 }
 
@@ -145,6 +152,33 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
return ret;
 }
 
+void pretty_print_dels()
+{
+   struct string_list list = STRING_LIST_INIT_DUP;
+   struct string_list_item *item;
+   struct strbuf buf = STRBUF_INIT;
+   const char *qname;
+   struct column_options copts;
+
+   for_each_string_list_item(item, &del_list) {
+   qname = quote_path_relative(item->string, -1, &buf, *the_prefix);
+   string_list_append(&list, qname);
+   }
+
+   /*
+* always enable column display, we only consult column.*
+* about layout strategy and stuff
+*/
+   colopts = (colopts & ~COL_ENABLE_MASK) | COL_ENABLED;
+   memset(&copts, 0, sizeof(copts));
+   copts.indent = "  ";
+   copts.padding = 2;
+   print_columns(&list, colopts, &copts);
+   putchar('\n');
+   strbuf_release(&buf);
+   string_list_clear(&list, 0);
+}
+
 void edit_by_patterns_cmd()
 {
struct dir_struct dir;
@@ -153,7 +187,6 @@ void edit_by_patterns_cmd()
struct strbuf **ignore_list;
struct string_list_item *item;
struct exclude_list *el;
-   const char *qname;
int changed = -1, i;
 
while (1) {
@@ -166,12 +199,8 @@ void edit_by_patterns_cmd()
if (changed) {
putchar('\n');
 
-   /* Display dels in "Would remove ..." format */
-   for_each_string_list_item(item, &del_list) {
-   qname = quote_path_relative(item->string, -1, &buf, *the_prefix);
-   printf(_(msg_would_remove), qname);
-   }
-   putchar('\n');
+   /* Display dels in columns */
+   pretty_print_dels();
}
 
printf(_(Input ignore patterns ));
@@ -228,23 +257,17 @@ void edit_by_patterns_cmd()
 void interactive_main_loop()
 {
struct strbuf confirm = STRBUF_INIT;
-   struct strbuf buf = STRBUF_INIT;
-   struct string_list_item *item;
-   const char *qname;
 
/* dels list may become empty after return back from edit mode */
while (del_list.nr) {
+   putchar('\n');
printf_ln(Q_("Would remove the following item:",
 "Would remove the following items:",
 del_list.nr));
putchar('\n');
 
-   /* Display dels in "Would remove ..." format */
-   for_each_string_list_item(item, &del_list) {
-   qname = quote_path_relative(item->string, -1, &buf, *the_prefix);
-   

[PATCH v6 3/7] Add colors to interactive git-clean

2013-05-06 Thread Jiang Xin
Show header, help, error messages, and prompt in colors for interactive
git-clean. Re-use config variables for other git commands, such as
git-add--interactive and git-stash:

 * color.interactive: When set to "always", always use colors for
   interactive prompts and displays. When false (or "never"),
   never. When set to "true" or "auto", use colors only when the
   output is to the terminal.

 * color.interactive.<slot>: Use customized color for interactive
   git-clean output (like git add --interactive). <slot> may be
   "prompt", "header", "help" or "error".
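In config-file form this might look like the following (the values are illustrative; the slot names are the ones listed above):

```
[color]
	interactive = auto
[color "interactive"]
	prompt = blue bold
	header = bold
	help = red bold
	error = red bold
```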

Signed-off-by: Jiang Xin worldhello@gmail.com
Comments-by: Matthieu Moy matthieu@imag.fr
---
 builtin/clean.c | 78 -
 1 file changed, 77 insertions(+), 1 deletion(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index 43383..6bda3 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -14,6 +14,7 @@
 #include "string-list.h"
 #include "quote.h"
 #include "column.h"
+#include "color.h"
 
 static int force = -1; /* unset */
 static int interactive;
@@ -32,16 +33,81 @@ static const char *msg_skip_git_dir = N_("Skipping repository %s\n");
 static const char *msg_would_skip_git_dir = N_("Would skip repository %s\n");
 static const char *msg_warn_remove_failed = N_("failed to remove %s");
 
+static int clean_use_color = -1;
+static char clean_colors[][COLOR_MAXLEN] = {
+   GIT_COLOR_RESET,
+   GIT_COLOR_NORMAL,   /* PLAIN */
+   GIT_COLOR_BOLD_BLUE,/* PROMPT */
+   GIT_COLOR_BOLD, /* HEADER */
+   GIT_COLOR_BOLD_RED, /* HELP */
+   GIT_COLOR_BOLD_RED, /* ERROR */
+};
+enum color_clean {
+   CLEAN_COLOR_RESET = 0,
+   CLEAN_COLOR_PLAIN = 1,
+   CLEAN_COLOR_PROMPT = 2,
+   CLEAN_COLOR_HEADER = 3,
+   CLEAN_COLOR_HELP = 4,
+   CLEAN_COLOR_ERROR = 5,
+};
+
+static int parse_clean_color_slot(const char *var, int ofs)
+{
+   if (!strcasecmp(var+ofs, "reset"))
+   return CLEAN_COLOR_RESET;
+   if (!strcasecmp(var+ofs, "plain"))
+   return CLEAN_COLOR_PLAIN;
+   if (!strcasecmp(var+ofs, "prompt"))
+   return CLEAN_COLOR_PROMPT;
+   if (!strcasecmp(var+ofs, "header"))
+   return CLEAN_COLOR_HEADER;
+   if (!strcasecmp(var+ofs, "help"))
+   return CLEAN_COLOR_HELP;
+   if (!strcasecmp(var+ofs, "error"))
+   return CLEAN_COLOR_ERROR;
+   return -1;
+}
+
 static int git_clean_config(const char *var, const char *value, void *cb)
 {
if (!prefixcmp(var, "column."))
return git_column_config(var, value, "clean", &colopts);

+   /* honors the color.interactive* config variables which are also
+  applied in git-add--interactive and git-stash */
+   if (!strcmp(var, "color.interactive")) {
+   clean_use_color = git_config_colorbool(var, value);
+   return 0;
+   }
+   if (!prefixcmp(var, "color.interactive.")) {
+   int slot = parse_clean_color_slot(var, 18);
+   if (slot < 0)
+   return 0;
+   if (!value)
+   return config_error_nonbool(var);
+   color_parse(value, var, clean_colors[slot]);
+   return 0;
+   }
+
if (!strcmp(var, "clean.requireforce")) {
force = !git_config_bool(var, value);
return 0;
}
-   return git_default_config(var, value, cb);
+
+   /* inspect the color.ui config variable and others */
+   return git_color_default_config(var, value, cb);
+}
+
+static const char *clean_get_color(enum color_clean ix)
+{
+   if (want_color(clean_use_color))
+   return clean_colors[ix];
+   return "";
+}
+
+static void clean_print_color(enum color_clean ix)
+{
+   printf("%s", clean_get_color(ix));
 }
 
 static int exclude_cb(const struct option *opt, const char *arg, int unset)
@@ -192,7 +258,9 @@ void edit_by_patterns_cmd()
while (1) {
/* dels list may become empty when we run string_list_remove_empty_items later */
if (!del_list.nr) {
+   clean_print_color(CLEAN_COLOR_ERROR);
+   printf_ln(_("No more files to clean, exiting."));
+   clean_print_color(CLEAN_COLOR_RESET);
break;
}
 
@@ -203,7 +271,9 @@ void edit_by_patterns_cmd()
pretty_print_dels();
}
 
+   clean_print_color(CLEAN_COLOR_PROMPT);
+   printf(_("Input ignore patterns>> "));
+   clean_print_color(CLEAN_COLOR_RESET);
if (strbuf_getline(&confirm, stdin, '\n') != EOF) {
strbuf_trim(&confirm);
} else {
@@ -243,7 +313,9 @@ void edit_by_patterns_cmd()
if (changed) {
string_list_remove_empty_items(del_list, 0);
} else {
+   

[PATCH v6 4/7] git-clean: use a git-add-interactive compatible UI

2013-05-06 Thread Jiang Xin
Rewrite the menu using a new method `list_and_choose`, which is borrowed
from `git-add--interactive.perl`. We can reuse this method later for
more actions.

Please NOTE:

 * Method `list_and_choose` returns an array of integers, and
 * it is up to you to free the allocated memory of the array.
 * The array ends with EOF.
 * If the user pressed CTRL-D (i.e. EOF), no selection is returned.

Signed-off-by: Jiang Xin worldhello@gmail.com
---
 builtin/clean.c | 410 ++--
 1 file changed, 367 insertions(+), 43 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index 6bda3..3b9f3 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -16,6 +16,35 @@
 #include "column.h"
 #include "color.h"
 
+#define MENU_OPTS_SINGLETON 01
+#define MENU_OPTS_IMMEDIATE 02
+#define MENU_OPTS_LIST_ONLY 04
+
+#define MENU_RETURN_NO_LOOP 10
+
+struct menu_opts {
+   const char *header;
+   const char *prompt;
+   int flag;
+};
+
+enum menu_stuff_type {
+   MENU_STUFF_TYPE_STRING_LIST = 1,
+   MENU_STUFF_TYPE_MENU_ITEM
+};
+
+struct menu_stuff {
+   enum menu_stuff_type type;
+   int nr;
+   void *stuff;
+};
+
+struct menu_item {
+   char hotkey;
+   char *title;
+   int (*fn)();
+};
+
 static int force = -1; /* unset */
 static int interactive;
 static struct string_list del_list = STRING_LIST_INIT_DUP;
@@ -240,12 +269,284 @@ void pretty_print_dels()
copts.indent = "  ";
copts.padding = 2;
print_columns(&list, colopts, &copts);
-   putchar('\n');
strbuf_release(&buf);
string_list_clear(&list, 0);
 }
 
-void edit_by_patterns_cmd()
+void pretty_print_menus(struct string_list *menu_list)
+{
+   struct strbuf buf = STRBUF_INIT;
+   unsigned int local_colopts = 0;
+   struct column_options copts;
+
+   /*
+* always enable column display, we only consult column.*
+* about layout strategy and stuff
+*/
+   local_colopts = COL_ENABLED | COL_ROW;
+   memset(&copts, 0, sizeof(copts));
+   copts.indent = "  ";
+   copts.padding = 2;
+   print_columns(menu_list, local_colopts, &copts);
+   strbuf_release(&buf);
+}
+
+void prompt_help_cmd(int singleton)
+{
+   clean_print_color(CLEAN_COLOR_HELP);
+   printf_ln(singleton ?
+ _("Prompt help:\n"
+   "1  - select a numbered item\n"
+   "foo- select item based on unique prefix\n"
+   "   - (empty) select nothing") :
+ _("Prompt help:\n"
+   "1  - select a single item\n"
+   "3-5- select a range of items\n"
+   "2-3,6-9- select multiple ranges\n"
+   "foo- select item based on unique prefix\n"
+   "-...   - unselect specified items\n"
+   "*  - choose all items\n"
+   "   - (empty) finish selecting"));
+   clean_print_color(CLEAN_COLOR_RESET);
+}
+
+/*
+ * Implement a git-add-interactive compatible UI, which is borrowed
+ * from git-add--interactive.perl.
+ *
+ * Return value:
+ *
+ *   - Returns an array of integers, and it is up to you to free
+ * the allocated memory of the array.
+ *   - The array ends with EOF.
+ *   - If the user pressed CTRL-D (i.e. EOF), no selection is returned.
+ */
+int *list_and_choose(struct menu_opts *opts, struct menu_stuff *stuff)
+{
+   static struct string_list menu_list = STRING_LIST_INIT_DUP;
+   struct strbuf menu = STRBUF_INIT;
+   struct strbuf choice = STRBUF_INIT;
+   struct strbuf **choice_list;
+   int *chosen, *result;
+   char *p;
+   int nr = 0;
+   int i, j;
+   int eof = 0;
+
+   chosen = xmalloc(sizeof(int) * stuff->nr);
+   memset(chosen, 0, sizeof(int) * stuff->nr);
+
+   while (1) {
+   int i = 0, j = 0;
+   string_list_clear(menu_list, 0);
+
+   if (opts-header) {
+   printf_ln("%s%s%s",
+ clean_get_color(CLEAN_COLOR_HEADER),
+ opts->header,
+ clean_get_color(CLEAN_COLOR_RESET));
+   }
+
+   /* highlight hotkey in menu */
+   if (MENU_STUFF_TYPE_MENU_ITEM == stuff->type) {
+   struct menu_item *item;
+
+   item = (struct menu_item *)stuff->stuff;
+   for (i = 0; i < stuff->nr; i++, item++) {
+   p = item->title;
+   strbuf_addf(&menu, "%s%2d: ", chosen[i] ? "*" : " ", i+1);
+   for (; *p; p++) {
+   if (*p == item->hotkey) {
+   strbuf_addstr(&menu, clean_get_color(CLEAN_COLOR_PROMPT));
+   strbuf_addch(&menu, 

[PATCH v3 10/9] revision.c: treat A...B merge bases as if manually specified

2013-05-06 Thread Kevin Bracey
The documentation assures users that A...B is defined as 'r1 r2 --not
$(git merge-base --all r1 r2)'. This isn't in fact quite true, because
the calculated merge bases are not sent to add_rev_cmdline().

Previously, the effect was pretty minor - all that I can think of is
that git rev-list --ancestry-path A B ^AB_base works, but git
rev-list --ancestry-path A...B fails with a no bottom commits error.

But now that all history walking cares about bottom commits, this
failure to note the merge bases as bottoms can lead to worse results for
A...B compared to A B ^AB_base.

So ensure that the calculated merge bases are sent to add_rev_cmdline(),
flagged as 'REV_CMD_MERGE_BASE'.
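The documented equivalence is easy to check in a scratch repository (a self-contained sketch; the branch names A and B are invented):

```shell
tmp=$(mktemp -d) && cd "$tmp"
git init -q .
git config user.email you@example.com
git config user.name you
echo base >f && git add f && git commit -qm base
base=$(git rev-parse HEAD)
git checkout -qb A && echo a >f && git commit -qam a
git checkout -qb B "$base" && echo b >f && git commit -qam b

lhs=$(git rev-list A B --not $(git merge-base --all A B) | sort)
rhs=$(git rev-list A...B | sort)
test "$lhs" = "$rhs" && echo "A...B matches A B --not merge-bases"
```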

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 revision.c   | 9 +++--
 revision.h   | 2 ++
 t/t6111-rev-list-treesame.sh | 4 
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/revision.c b/revision.c
index 99a3432..aad16a6 100644
--- a/revision.c
+++ b/revision.c
@@ -1305,12 +1305,16 @@ void init_revisions(struct rev_info *revs, const char 
*prefix)
 
 static void add_pending_commit_list(struct rev_info *revs,
 struct commit_list *commit_list,
+int whence,
 unsigned int flags)
 {
while (commit_list) {
struct object *object = commit_list-item-object;
+   const char *sha1 = sha1_to_hex(object->sha1);
object->flags |= flags;
-   add_pending_object(revs, object, sha1_to_hex(object->sha1));
+   if (whence != REV_CMD_NONE)
+   add_rev_cmdline(revs, object, sha1, whence, flags);
+   add_pending_object(revs, object, sha1);
commit_list = commit_list-next;
}
 }
@@ -1332,7 +1336,7 @@ static void prepare_show_merge(struct rev_info *revs)
add_pending_object(revs, &head->object, "HEAD");
add_pending_object(revs, &other->object, "MERGE_HEAD");
bases = get_merge_bases(head, other, 1);
-   add_pending_commit_list(revs, bases, UNINTERESTING);
+   add_pending_commit_list(revs, bases, REV_CMD_MERGE_BASE, UNINTERESTING);
free_commit_list(bases);
head->object.flags |= SYMMETRIC_LEFT;
 
@@ -1420,6 +1424,7 @@ int handle_revision_arg(const char *arg_, struct rev_info 
*revs, int flags, unsi
if (symmetric) {
exclude = get_merge_bases(a, b, 1);
add_pending_commit_list(revs, exclude,
+   REV_CMD_MERGE_BASE,
flags_exclude);
free_commit_list(exclude);
a_flags = flags | SYMMETRIC_LEFT;
diff --git a/revision.h b/revision.h
index 765630a..b2ac5a3 100644
--- a/revision.h
+++ b/revision.h
@@ -32,10 +32,12 @@ struct rev_cmdline_info {
struct object *item;
const char *name;
enum {
+   REV_CMD_NONE,
REV_CMD_REF,
REV_CMD_PARENTS_ONLY,
REV_CMD_LEFT,
REV_CMD_RIGHT,
+   REV_CMD_MERGE_BASE,
REV_CMD_REV
} whence;
unsigned flags;
diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh
index 689d357..ded0b1a 100755
--- a/t/t6111-rev-list-treesame.sh
+++ b/t/t6111-rev-list-treesame.sh
@@ -161,6 +161,10 @@ check_result 'F' B..F --ancestry-path --simplify-merges -- file
 check_result 'F D' B..F --first-parent
 check_result 'F' B..F --first-parent -- file
 
+# E...F should be equivalent to E F ^B, and be able to drop D as above.
+check_result 'F' E F ^B -- file
+check_result 'F' E...F -- file
+
 # Any sort of full history of C..F should show D, as it's the connection to C,
 # and it differs from it.
 check_result 'F D B' C..F
-- 
1.8.3.rc0.28.g4b02ef5



[PATCH v6 5/7] git-clean: interactive cleaning by select numbers

2013-05-06 Thread Jiang Xin
Draw a multiple choice menu using `list_and_choose` to select items
to be deleted by numbers.

User can input:

 *  1,5-7 : select items 1,5,6,7 to be deleted
 *  *     : select all items to be deleted
 *  -*    : unselect all, nothing will be deleted
 *        : (empty) finish selecting, and return back to the main menu

Signed-off-by: Jiang Xin worldhello@gmail.com
---
 builtin/clean.c | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/builtin/clean.c b/builtin/clean.c
index 3b9f3..3b07f 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -622,6 +622,43 @@ int edit_by_patterns_cmd()
return 0;
 }
 
+int edit_by_numbers_cmd()
+{
+   struct menu_opts menu_opts;
+   struct menu_stuff menu_stuff;
+   struct string_list_item *items;
+   int *chosen;
+   int i, j;
+
+   menu_opts.header = NULL;
+   menu_opts.prompt = "Select items to delete";
+   menu_opts.flag = 0;
+
+   menu_stuff.type = MENU_STUFF_TYPE_STRING_LIST;
+   menu_stuff.stuff = &del_list;
+   menu_stuff.nr = del_list.nr;
+
+   chosen = list_and_choose(&menu_opts, &menu_stuff);
+   items = del_list.items;
+   for (i = 0, j = 0; i < del_list.nr; i++) {
+   if (i < chosen[j]) {
+   *(items[i].string) = '\0';
+   } else if (i == chosen[j]) {
+   /* delete selected item */
+   j++;
+   continue;
+   } else {
+   /* end of chosen (EOF), won't delete */
+   *(items[i].string) = '\0';
+   }
+   }
+
+   string_list_remove_empty_items(&del_list, 0);
+
+   free(chosen);
+   return 0;
+}
+
 int quit_cmd()
 {
string_list_clear(del_list, 0);
@@ -635,6 +672,7 @@ int help_cmd(int x)
printf_ln(_(
"clean- start cleaning\n"
"edit by patterns - exclude items from deletion\n"
+   "edit by numbers  - select items to be deleted by numbers\n"
"quit - stop cleaning\n"
"help - this screen\n"
"?- help for prompt selection"
@@ -652,6 +690,7 @@ void interactive_main_loop()
struct menu_item menus[] = {
{'c', "clean",  clean_cmd},
{'p', "edit by patterns",   edit_by_patterns_cmd},
+   {'n', "edit by numbers",edit_by_numbers_cmd},
{'q', "quit",   quit_cmd},
{'h', "help",   help_cmd},
};
-- 
1.8.3.rc1.338.gb35aa5d



[PATCH v6 6/7] git-clean: rm -i style interactive cleaning

2013-05-06 Thread Jiang Xin
Add a "rm -i" style interactive cleaning method: the user must confirm
each item one by one before it is deleted.
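For illustration, the one-by-one confirmation can be sketched in POSIX shell (file names and the scripted answers are invented for the example):

```shell
# Scripted answers stand in for the user's y/n input: y, n, y.
answers='y n y'
kept=
for f in a.o b.o c.o; do
	ans=${answers%% *}              # take the next scripted answer
	answers=${answers#* }
	printf 'remove %s ? %s\n' "$f" "$ans"
	case "$ans" in
	[yY]*) ;;                       # confirmed: stays on the deletion list
	*) kept="$kept $f" ;;           # any other answer keeps the file
	esac
done
echo "kept:$kept"   # -> kept: b.o
```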

Signed-off-by: Jiang Xin worldhello@gmail.com
---
 builtin/clean.c | 36 
 1 file changed, 36 insertions(+)

diff --git a/builtin/clean.c b/builtin/clean.c
index 3b07f..f36ad 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -659,6 +659,40 @@ int edit_by_numbers_cmd()
return 0;
 }
 
+int rm_i_cmd()
+{
+   struct strbuf confirm = STRBUF_INIT;
+   struct strbuf buf = STRBUF_INIT;
+   struct string_list_item *item;
+   const char *qname;
+   int changed = 0, eof = 0;
+
+   for_each_string_list_item(item, &del_list) {
+   /* Ctrl-D should stop removing files */
+   if (!eof) {
+   qname = quote_path_relative(item->string, -1, &buf, *the_prefix);
+   printf(_("remove %s ? "), qname);
+   if (strbuf_getline(&confirm, stdin, '\n') != EOF) {
+   strbuf_trim(&confirm);
+   } else {
+   putchar('\n');
+   eof = 1;
+   }
+   }
+   if (!confirm.len || !strncasecmp(confirm.buf, "no", confirm.len)) {
+   *item->string = '\0';
+   changed++;
+   }
+   }
+
+   if (changed)
+   string_list_remove_empty_items(&del_list, 0);
+
+   strbuf_release(&buf);
+   strbuf_release(&confirm);
+   return MENU_RETURN_NO_LOOP;
+}
+
 int quit_cmd()
 {
string_list_clear(del_list, 0);
@@ -673,6 +707,7 @@ int help_cmd(int x)
"clean- start cleaning\n"
"edit by patterns - exclude items from deletion\n"
"edit by numbers  - select items to be deleted by numbers\n"
+   "rm -i- delete items one by one, like \"rm -i\"\n"
"quit - stop cleaning\n"
"help - this screen\n"
"?- help for prompt selection"
@@ -691,6 +726,7 @@ void interactive_main_loop()
{'c', "clean",  clean_cmd},
{'p', "edit by patterns",   edit_by_patterns_cmd},
{'n', "edit by numbers",edit_by_numbers_cmd},
+   {'i', "rm -i",  rm_i_cmd},
{'q', "quit",   quit_cmd},
{'h', "help",   help_cmd},
};
-- 
1.8.3.rc1.338.gb35aa5d



[PATCH v6 7/7] git-clean: update document for interactive git-clean

2013-05-06 Thread Jiang Xin
Signed-off-by: Jiang Xin worldhello@gmail.com
---
 Documentation/git-clean.txt | 70 -
 1 file changed, 63 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-clean.txt b/Documentation/git-clean.txt
index f5572..56d60 100644
--- a/Documentation/git-clean.txt
+++ b/Documentation/git-clean.txt
@@ -39,13 +39,8 @@ OPTIONS
 
 -i::
 --interactive::
-   Show what would be done and the user must confirm before actually
-   cleaning. In the confirmation dialog, the user can choose to abort
-   the cleaning, or enter into an edit mode. In the edit mode, the
-   user can input space-separated patterns (the same syntax as
-   gitignore), and each clean candidate that matches with one of the
-   patterns will be excluded from cleaning. When the user feels it's
-   OK, presses ENTER and back to the confirmation dialog.
+   Show what would be done and clean files interactively. See
+   ``Interactive mode`` for details.
 
 -n::
 --dry-run::
@@ -74,6 +69,67 @@ OPTIONS
Remove only files ignored by Git.  This may be useful to rebuild
everything from scratch, but keep manually created files.
 
+Interactive mode
+----------------
+When the command enters the interactive mode, it shows the
+files and directories to be cleaned, and goes into its
+interactive command loop.
+
+The command loop shows the list of subcommands available, and
+gives a prompt "What now> ".  In general, when the prompt ends
+with a single '>', you can pick only one of the choices given
+and type return, like this:
+
+------------
+*** Commands ***
+  1: clean                2: edit by patterns     3: edit by numbers
+  4: rm -i                5: quit                 6: help
+What now> 2
+
+------------
+You could also say `c` or `clean` above as long as the choice is unique.
+
+The main command loop has 6 subcommands.
+
+clean::
+
+   Start cleaning files and directories, and then quit.
+
+edit by patterns::
+
+	This shows the files and directories to be deleted and issues an
+	"Input ignore patterns>>" prompt. You can input space-separated
+	patterns to exclude files and directories from deletion.
+	E.g. "*.c *.h" will exclude files ending with ".c" and ".h" from
+	deletion. When you are satisfied with the filtered result, press
+	ENTER (empty) to go back to the main menu.
+
+edit by numbers::
+
+	This shows the files and directories to be deleted and issues a
+	"Select items to delete>>" prompt. When the prompt ends with double
+	'>>' like this, you can make more than one selection, concatenated
+	with whitespace or comma.  You can also give ranges.  E.g. "2-5 7,9"
+	to choose 2,3,4,5,7,9 from the list.  If the second number in a
+	range is omitted, all remaining items are taken.  E.g. "7-" to
+	choose 7,8,9 from the list.  You can say '*' to choose everything.
+	When you are satisfied with the filtered result, press ENTER
+	(empty) to go back to the main menu.
+
+rm -i::
+
+  This performs "rm -i" style cleaning: you must confirm the
+  deletion of each item one by one. This action is not as efficient
+  as the two edit-based actions above.
+
+quit::
+
+  This lets you quit without doing any cleaning.
+
+help::
+
+  Show brief usage of interactive git-clean.
+
 SEE ALSO
 
 linkgit:gitignore[5]
-- 
1.8.3.rc1.338.gb35aa5d
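As a concrete aside, the pattern-exclusion behaviour documented above can be approximated non-interactively with git clean's existing `-e`/`--exclude` option together with the `-n` dry run; a minimal sketch (temporary directory and file names are purely illustrative):

```shell
# Sketch: approximate the "edit by patterns" step non-interactively.
# -n only reports what would be removed; -e excludes matching paths,
# so the .c/.h files survive the (dry-run) clean.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
touch keep.c keep.h junk.o junk.log

out=$(git clean -n -e '*.c' -e '*.h')
echo "$out"   # mentions junk.log and junk.o, but not keep.c or keep.h
```

The interactive "edit by patterns" subcommand does the same filtering, but lets you refine the pattern list before committing to the removal.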



Re: [PATCH] l10n: de.po: translate 44 new messages

2013-05-06 Thread Thomas Rast
Ralf Thielow ralf.thie...@gmail.com writes:

  #: branch.c:275
 -#, fuzzy, c-format
 +#, c-format
  msgid Not a valid branch point: '%s'.
 -msgstr Ungültiger Zweig-Name: '%s'
 +msgstr Ungültige Zweig-Version: '%s'

That's a bit strange and calls for a proper translation of branch
point.  Abzweigung is really tempting, but probably does nothing to
clarify the intent ;-)

Perhaps Abzweigpunkt.  dict.leo.org has some other reasonable ideas.

  #: builtin/checkout.c:1065
  msgid do not limit pathspecs to sparse entries only
 -msgstr 
 +msgstr keine Beschränkung von Pfadspezifikationen zu eingeschränkten 
 +Einträgen

Huh?! :-)

Even if you translated sparse entry as eingeschränkte Einträge -- it
doesn't have a wiki entry -- using Beschränkung in the same sentence
is just confusing.

How about

  Pfadspezifikationen ignorieren Einstellungen zum partiellen Auschecken


All the ones I didn't quote here: Acked-by: Thomas Rast tr...@inf.ethz.ch

Thanks for your work!

-- 
Thomas Rast
trast@{inf,student}.ethz.ch


Re: [PATCH v3 3/9] t6111: new TREESAME test set

2013-05-06 Thread Junio C Hamano
Kevin Bracey ke...@bracey.fi writes:

 Some side branching and odd merging to illustrate various flaws in
 revision list scans, particularly when limiting the list.

 Many expected failures, which will be gone by the end of the history
 traversal refinements series.

 Signed-off-by: Kevin Bracey ke...@bracey.fi
 ---
  t/t6111-rev-list-treesame.sh | 180 
 +++
  1 file changed, 180 insertions(+)
  create mode 100755 t/t6111-rev-list-treesame.sh

 diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh
 new file mode 100755
 index 000..602c02d
 --- /dev/null
 +++ b/t/t6111-rev-list-treesame.sh
 @@ -0,0 +1,180 @@
 +#!/bin/sh
 +#
 +#,---E--.   *H--. * marks !TREESAME parent paths
 +#   /\ / \*
 +# *A--*B---D--*F-*G-K-*L-*M
 +#   \ /*   \   /
 +#`-C-'  `-*I-*J
 +#
 +# A creates file, B and F change it.
 +# Odd merge G takes the old version from B.
 +# I changes it, but J reverts it.
 +# H and L both change it, and M merges those changes.

... and because J is a revert of I, K ends up merging the same and
being the same with both G and J.

 +test_description='TREESAME and limiting'
 +
 +. ./test-lib.sh
 +
 +note () {
 +	git tag "$1"
 +}
 +
 +unnote () {
 +	git name-rev --tags --stdin | sed -e "s|$_x40 (tags/\([^)]*\)) |\1 |g"
 +}
 +
 +test_expect_success setup '
 +	test_commit "Initial file" file "Hi there" A &&
 +	git branch other-branch &&
 +
 +	test_commit "file=Hello" file "Hello" B &&
 +	git branch third-branch &&
 +
 +	git checkout other-branch &&
 +	test_commit "Added other" other "Hello" C &&
 +
 +	git checkout master &&
 +	test_merge D other-branch &&
 +
 +	git checkout third-branch &&
 +	test_commit "Third file" third "Nothing" E &&
 +
 +	git checkout master &&
 +	test_commit "file=Blah" file "Blah" F &&
 +
 +	test_tick && git merge --no-commit third-branch &&
 +	git checkout third-branch file &&
 +	git commit &&
 +	note G &&
 +	git branch fiddler-branch &&
 +
 +	git checkout -b part2-branch &&
 +	test_commit "file=Part 2" file "Part 2" H &&
 +
 +	git checkout fiddler-branch &&
 +	test_commit "Bad commit" file "Silly" I &&
 +
 +	test_tick && git revert I && note J &&
 +
 +	git checkout master &&
 +	test_tick && git merge --no-ff fiddler-branch &&
 +	note K &&
 +
 +	test_commit "file=Part 1" file "Part 1" L &&
 +
 +	test_tick && test_must_fail git merge part2-branch &&
 +	test_commit M file "Parts 1+2"
 +'
 +
 +FMT='tformat:%P  %H | %s'
 +
 +# could we soup this up to optionally check parents? So (BA)C would check
 +# that C is shown and has parents B A.

Sounds like a good thing to do, especially given that some of the
earlier issues you brought up triggered only when --parents is not
used, so...

Perhaps something like the attached.

 +check_outcome () {
 +	outcome=$1
 +	shift
 +	for c in $1
 +	do
 +		echo $c
 +	done >expect &&

Maybe

	printf "%s\n" $1 >expect

is more fashionable these days?
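For what it's worth, the two spellings produce identical expect files for simple space-separated values: the unquoted `$list` word-splits, so printf gets one argument per item and repeats its format for each. A quick check:

```shell
# Compare the loop-based and the printf-based way of writing
# one item per line into an expect file.
set -e
cd "$(mktemp -d)"
list="M L K J"

for c in $list
do
	echo "$c"
done >expect1

printf "%s\n" $list >expect2   # format is reused for every argument

diff expect1 expect2 && echo identical   # prints "identical"
```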

 +# Odd merge G drops a change in F. Important that G is listed in all
 +# except the most basic list. Achieving this means normal merge D will also 
 be
 +# shown in normal full-history, as we can't distinguish unless we do a
 +# simplification pass. After simplification, D is dropped but G remains.
 +check_result 'M L K J I H G F E D C B A'

I'll read the expectation and comment on them in a separate message.

 t/t6111-rev-list-treesame.sh | 30 +-
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh
index 602c02d..f4c9cac 100755
--- a/t/t6111-rev-list-treesame.sh
+++ b/t/t6111-rev-list-treesame.sh
@@ -20,7 +20,7 @@ note () {
 }
 
 unnote () {
-	git name-rev --tags --stdin | sed -e "s|$_x40 (tags/\([^)]*\)) |\1 |g"
+	git name-rev --tags --stdin | sed -e "s|$_x40 (tags/\([^)]*\))\([ 	]\)|\1\2|g"
 }
 
 test_expect_success setup '
@@ -66,23 +66,34 @@ test_expect_success setup '
test_commit M file Parts 1+2
 '
 
-FMT='tformat:%P	%H | %s'
-
 # could we soup this up to optionally check parents? So (BA)C would check
 # that C is shown and has parents B A.
 check_outcome () {
outcome=$1
shift
-   for c in $1
-   do
-   echo $c
-   done expect 
-   shift 
+
+	case "$1" in
+	*"("*)
+		FMT="%P	%H | %s"
+		munge_actual="
+			s/^\([^	]*\)	\([^ ]*\) .*/(\1)\2/
+			s/ //g
+			s/()//
+		"
+		;;
+	*)
+		FMT="%H | %s"
+		munge_actual="s/^\([^ ]*\) .*/\1/"
+		;;
+	esac &&
+	printf "%s\n" $1 >expect &&
+	shift
+
	param="$*" &&
	test_expect_$outcome "log $param" '
		git log --format="$FMT" $param |
		unnote >actual &&
-   sed -e s/^.*   \([^ ]*\) .*/\1/ 

Re: git rev-list | git cherry-pick --stdin is leaky

2013-05-06 Thread Stephen Boyd
On 04/30/13 15:47, René Scharfe wrote:
 Am 30.04.2013 02:11, schrieb Stephen Boyd:
 (resending since the attachment seems to make vger sad)

 Hi,

 I'm running git rev-list | git cherry-pick --stdin on a range of about
 300 commits. Eventually the chery-pick dies with:

  error: cannot fork() for commit: Cannot allocate memory

 Running valgrind shows me that the tree traversal code is leaking
 gigabytes of memory (particularly unpack_callback). Since cherry-pick is
 a very long running process all these allocations are never freed and
 eventually I run out of memory. The worst offender and summary is:

 ==7986== 938,956,692 (929,961,582 direct, 8,995,110 indirect) bytes in
 7,765,439 blocks are definitely lost in loss record 257 of 257
 ==7986==at 0x4C267CC: calloc (vg_replace_malloc.c:467)
 ==7986==by 0x4FAF57: xcalloc (wrapper.c:119)
 ==7986==by 0x4F5281: unpack_callback (unpack-trees.c:539)
 ==7986==by 0x4F40E5: traverse_trees (tree-walk.c:407)
 ==7986==by 0x4F586C: unpack_callback (unpack-trees.c:467)
 ==7986==by 0x4F40E5: traverse_trees (tree-walk.c:407)
 ==7986==by 0x4F586C: unpack_callback (unpack-trees.c:467)
 ==7986==by 0x4F40E5: traverse_trees (tree-walk.c:407)
 ==7986==by 0x4F586C: unpack_callback (unpack-trees.c:467)
 ==7986==by 0x4F40E5: traverse_trees (tree-walk.c:407)
 ==7986==by 0x4F586C: unpack_callback (unpack-trees.c:467)
 ==7986==by 0x4F40E5: traverse_trees (tree-walk.c:407)
 ==7986==
 ==7986== LEAK SUMMARY:
 ==7986==definitely lost: 2,514,117,692 bytes in 21,210,861 blocks
 ==7986==indirectly lost: 885,481,947 bytes in 10,165,801 blocks
 ==7986==  possibly lost: 650,712,395 bytes in 6,014,309 blocks
 ==7986==still reachable: 7,734,870 bytes in 47,794 blocks
 ==7986== suppressed: 0 bytes in 0 blocks
 I looked at that particular leak a year ago but couldn't convince myself
 to submit the patch below.  If the callback function we call through
 call_unpack_fn does something strange like free()ing entries itself or
 adding them to some list without duplication then the added free() can
 cause trouble.

 Looking at it again today I don't understand that concern any more.  The
 current callback functions don't do something like that, in any case.
 Maybe I'm missing something.

 Anyway, could you please check if the patch helps with your use case?

Ok. I tested it and it definitely helps.

==10728== LEAK SUMMARY:
==10728==definitely lost: 316,355,458 bytes in 8,652 blocks
==10728==indirectly lost: 1,327,251,588 bytes in 16,180,628 blocks
==10728==  possibly lost: 677,049,918 bytes in 7,381,801 blocks
==10728==still reachable: 9,238,039 bytes in 63,947 blocks
==10728== suppressed: 0 bytes in 0 blocks

vs.

==27614== LEAK SUMMARY:
==27614==definitely lost: 2,369,692,222 bytes in 20,005,707 blocks
==27614==indirectly lost: 829,151,786 bytes in 9,594,715 blocks
==27614==  possibly lost: 658,069,373 bytes in 6,345,172 blocks
==27614==still reachable: 8,806,386 bytes in 50,199 blocks
==27614== suppressed: 0 bytes in 0 blocks

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation



Re: [PATCH v3 3/9] t6111: new TREESAME test set

2013-05-06 Thread Junio C Hamano
Kevin Bracey ke...@bracey.fi writes:

 +#,---E--.   *H--. * marks !TREESAME parent paths
 +#   /\ / \*
 +# *A--*B---D--*F-*G-K-*L-*M
 +#   \ /*   \   /
 +#`-C-'  `-*I-*J
 +#
 +# A creates file, B and F change it.
 +# Odd merge G takes the old version from B.
 +# I changes it, but J reverts it.
 +# H and L both change it, and M merges those changes.
 + ...
 +# Odd merge G drops a change in F. Important that G is listed in all
 +# except the most basic list. Achieving this means normal merge D will also 
 be
 +# shown in normal full-history, as we can't distinguish unless we do a
 +# simplification pass. After simplification, D is dropped but G remains.
 +check_result 'M L K J I H G F E D C B A'

OK.

 +check_result 'M H L K J I G E F D C B A' --topo-order

OK.

 +check_result 'M L H B A' -- file

OK.

 +check_result 'M L H B A' --parents -- file

OK.

 +check_outcome failure 'M L J I H G F D B A' --full-history -- file # drops G

This breaks the promise that full history at least gives a connected
graph.

 +check_result 'M L K J I H G F D B A' --full-history --parents -- file

And it is saved by --parents; the (parents)commit notation may
make it easier to read the above two tests.

 +check_outcome failure 'M H L J I G F B A' --simplify-merges -- file # drops G

The history --simplify-merges gives ought to be a full history that
keeps any commit that locally touches the specified path, so keeping
I and J even though they do not contribute to the end result is
correct.  The path in question is the same between G and B, but
again, F should be shown because it should be in the full history.

Even though E simplifies down to B, making G a merge between B and F
(the former is an ancestor of the latter), G is kept, because...?

Because the path at B and F are different?

I have to wonder what should happen if D makes path different from B,
and then F makes path the same as B.  B and F should still be kept.

So because the path at B and F are different is not the reason why
we keep G.  How do we decide why G must be kept?

 +check_result 'M L K G F D B A' --first-parent
 +check_result 'M L G F B A' --first-parent -- file

OK.

 +# Check that odd merge G remains shown when F is the bottom.

I did not have a chance to say this when you responded in the
previous round with that bottom thing before you sent this new
round, but I am not sure if it is a good idea to pay special
attention to the bottom-ness.  Very often, instead of running git
log -- path and stopping when seeing enough by killing the pager,
people run git log HEAD~200.. -- path to pick recent enough
commits, but the number 200 is very much arbitrary.

 +check_result 'M L K J I H G E' F..M

OK.

 +check_result 'M H L K J I G E' F..M --topo-order

OK.  But a note about style: options should come before revision
ranges.

 +check_result 'M L H' F..M -- file

OK.  It is still one possible history to explain what we have in the
path at the end.

 +check_result 'M L H' F..M --parents -- file # L+H's parents rewritten to B, 
 so more useful than it may seem

OK.

 +check_outcome failure 'M L J I H G' F..M --full-history -- file # drops G

G is same as E which is in the graph (F and B are the boundaries of
the history, and F is no special than B in that regard).  You want
to show a merge that is different from at least one parent (as
opposed to showing only when it is different from all parents), so
you want to show G.  OK.

 +check_result 'M L K J I H G' F..M --full-history --parents -- file

OK.

It almost makes me suspect that the --full-history option should
flip revs->rewrite_parents on (but not revs->print_parents, unless
the --parents option asks us to) when the option is parsed.

 +check_outcome failure 'M H L J I G' F..M --simplify-merges -- file # drops G

The same as the --full-history without --parents before.

 +check_result 'M L K J I H G' F..M --ancestry-path
 +check_outcome failure 'M L J I H G' F..M --ancestry-path -- file # drops G

With --ancestry-path, what G does relative to E becomes irrelevant,
because E is not in the F..M graph.  G changes path relative to its
remaining parent F, so it should be shown.  OK.

 +check_result 'M L K J I H G' F..M --ancestry-path --parents -- file

OK.  Again this makes me suspect --ancestry-path should flip
revs->rewrite_parents on.

 +check_result 'M H L J I G' F..M --ancestry-path --simplify-merges -- file

 +check_result 'M L K G' F..M --first-parent
 +check_result 'M L G' F..M --first-parent -- file
 +
 +# Note that G is pruned when E is the bottom, even if it's the same commit 
 list
 +# If we want history since E, then we're quite happy to ignore G that took E.
 +check_result 'M L K J I H G' E..M --ancestry-path

OK.

 +check_result 'M L J I H' E..M --ancestry-path -- file

Because F is outside our graph with --ancestry-path, G does not
bring anything new to path relative to its remaining parent E and
should not be shown.  

Re: [PATCH v3 5/9] revision.c: Make --full-history consider more merges

2013-05-06 Thread Junio C Hamano
Kevin Bracey ke...@bracey.fi writes:

 diff --git a/revision.c b/revision.c
 index a67b615..c88ded8 100644
 --- a/revision.c
 +++ b/revision.c
 @@ -429,10 +429,100 @@ static int rev_same_tree_as_empty(struct rev_info *revs, struct commit *commit)
 	return retval >= 0 && (tree_difference == REV_TREE_SAME);
  }
  
 +struct treesame_state {
 + unsigned int nparents;
 + unsigned char treesame[FLEX_ARRAY];
 +};

I have been wondering if we want to do one-bit (not one-byte) per
parent but no biggie ;-)

 @@ -1971,6 +2083,70 @@ static struct merge_simplify_state *locate_simplify_state(struct rev_info *revs,
   return st;
  }
  
 +static int mark_redundant_parents(struct rev_info *revs, struct commit *commit)
 +{
 +...
 +		po=po->next;

	po = po->next;

 + }

This seems to be identical (modulo tests) from the previous round,
which I found a reasonable thing to do.


Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

2013-05-06 Thread Junio C Hamano
Felipe Contreras felipe.contre...@gmail.com writes:

 The story is different on the fast-import side, where we do say we
 dump the full table and a later run can depend on these marks.

 Yes, and gaining nothing but increased disk-space.

I thought that the gaining nothing has already been refuted by the
discussion several hours ago...

cf. http://thread.gmane.org/gmane.comp.version-control.git/223275/focus=223440

Puzzled...

 By discarding marks on blobs, we may be robbing some optimization
 possibilities, and by discarding marks on tags, we may be robbing
 some features, from users of fast-export; we might want to add an
 option --use-object-marks={blob,commit,tag} or something to both
 fast-export and fast-import, so that the former can optionally write
 marks for non-commits out, and the latter can omit non commit marks
 if the user do not need them. But that is a separate issue.

 How?

 * if we teach fast-import to optionally not write marks for blobs
   and trees out, your remote-bzr can take advantage of it, because
   it does not reuse marks for non-commits in later runs, right?
   Existing users like cvs2git that do not ask to skip marks for
   non-commits will not be hurt and keep referring to blobs an
   earlier run wrote out.

 * if we teach fast-export to optionally write marks for blobs and
   trees out, the users of fast-export could reuse marks for blobs
   and trees in later runs (perhaps they can drive fast-export from
   the output of git log --raw, noticing blob object names they
   already saw).  Existing users that do not ask for such a feature
   will not be hurt.


Re: [PATCH 4/4] fast-import: only store commit objects

2013-05-06 Thread Felipe Contreras
On Mon, May 6, 2013 at 5:45 AM, Michael Haggerty mhag...@alum.mit.edu wrote:
 On 05/06/2013 12:32 PM, Thomas Rast wrote:
 Michael Haggerty mhag...@alum.mit.edu writes:

 On 05/03/2013 08:23 PM, Felipe Contreras wrote:
 On Fri, May 3, 2013 at 12:56 PM, Thomas Rast tr...@inf.ethz.ch wrote:
 Felipe Contreras felipe.contre...@gmail.com writes:

 How do we know that this doesn't break any users of fast-import?  Your
 comment isn't very reassuring:

 the vast majority of them will never be used again

 So what's with the minority?

 Actually I don't think there's any minority. If the client program
 doesn't store blobs, the blob marks are not used anyway. So there's no
 change.

 I haven't been following this conversation in detail, but your proposed
 change sounds like something that would break cvs2git [1].  Let me
 explain what cvs2git does and why:

 CVS stores all of the revisions of a single file in a single filename,v
 file in rcsfile(5) format.  The revisions are stored as deltas ordered
 so that a single revision can be reconstructed from a single serial read
 of the file.

 cvs2git reads each of these files once, reconstructing *all* of the
 revisions for a file in a single go.  It then pours them into a
 git-fast-import stream as blobs and sets a mark on each blob.

 Only much later in the conversion does it have enough information to
 reconstruct tree-wide commits.  At that time it outputs git-fast-import
 data (to a second file) defining the git commits and their ancestry.
 The contents are defined by referring to the marks of blobs from the
 first git-fast-import stream file.

 This strategy speeds up the conversion *enormously*.

 So if I understand correctly that you are proposing to stop allowing
 marks on blob objects to be set and/or referred to later, then I object
 vociferously.

 The proposed patch wants to stop writing marks (in --export-marks) for
 anything but commits.  Does cvs2git depend on that?  I.e., are you using
 two separate fast-import processes for the blob and tree/commit phases
 you describe above?

 Yes, it can be handy to start loading the first blobfile in parallel
 with the later stages of the conversion, before the second dumpfile is
 ready.  In that case the user needs to pass --export-marks to the first
 fast-import process to export marks on blobs so that the marks can be
 passed to the second fast-import via --import-marks.

It can be used that way, but it doesn't have to be.

 So the proposed change would break a documented use of cvs2git.

It's documented as an alternative. How many people actually use this
form over the other? Is there really any advantage? It's possible that
basically nobody is using this form, and there are no benefits.

 Making the export of blob marks optional would of course be OK, as long
 as the default is to export them.

Nobody benefits from leaving the default as it is.

-- 
Felipe Contreras


Re: [PATCH 4/4] fast-import: only store commit objects

2013-05-06 Thread Felipe Contreras
On Mon, May 6, 2013 at 7:20 AM, Johannes Schindelin
johannes.schinde...@gmx.de wrote:
 Hi Thomas,

 On Mon, 6 May 2013, Thomas Rast wrote:

 The proposed patch wants to stop writing marks (in --export-marks) for
 anything but commits.

 Then it should not go in.

If that rationale was valid, no change in behavior should ever go in.
Of course, that is not the case, we want to move forward, so changes
in behavior do happen.

-- 
Felipe Contreras


Re: [PATCH 4/4] fast-import: only store commit objects

2013-05-06 Thread Felipe Contreras
On Mon, May 6, 2013 at 10:18 AM, Junio C Hamano gits...@pobox.com wrote:
 Michael Haggerty mhag...@alum.mit.edu writes:

 Yes, it can be handy to start loading the first blobfile in parallel
 with the later stages of the conversion, before the second dumpfile is
 ready.  In that case the user needs to pass --export-marks to the first
 fast-import process to export marks on blobs so that the marks can be
 passed to the second fast-import via --import-marks.

 So the proposed change would break a documented use of cvs2git.

 Making the export of blob marks optional would of course be OK, as long
 as the default is to export them.

 Thanks for a concise summary.  Your use case fits exactly what
 Felipe conjectured as the nonexistent minority.

Not true. cvs2git does *not* rely on the blobs being stored in a marks
file, because cvs2git does not rely on mark files at all.

 An option that lets the caller say I only care about marks on these
 types of objects to be written to (and read from) the exported marks
 file would help Felipe's use case without harming your use case,
 and would be a sane and safe way to go.

His case is not harmed at all. It's only the unfortunate command that
is mentioned in the documentation that didn't need to be mentioned at
all in the first place.

It should be the other way around, if it's only this documentation
that is affected, we could add a switch for that particular command,
and the documentation should be updated, but it's overkill to add a
switch for one odd command in some documentation somewhere, it would
be much better to update the odd command to avoid using marks at all,
which is what the more appropriate command does, right below in the
same documentation.

  cat ../cvs2svn-tmp/git-blob.dat ../cvs2svn-tmp/git-dump.dat | git fast-import

Should the rest of the real world be punished because somebody added a
command in some documentation somewhere, which wasn't actually needed
in the first place?

-- 
Felipe Contreras


Re: [PATCH v3 10/9] revision.c: treat A...B merge bases as if manually specified

2013-05-06 Thread Junio C Hamano
Kevin Bracey ke...@bracey.fi writes:

 The documentation assures users that A...B is defined as 'r1 r2 --not
 $(git merge-base --all r1 r2)'. This isn't in fact quite true, because
 the calculated merge bases are not sent to add_rev_cmdline().

We want the commands to be able to tell which ones in revs->pending
and revs->commits were specified by the end user and how.  While I
think it makes sense to mark these negative merge bases with These
came from the command line with A...B syntax, I am not sure if it
is the best way to do so in add_pending_commit_list().

By the way, why does this have anything to do with the history
traversal series in the first place?

When there is anything marked UNINTERESTING on the revs->pending
before calling prepare_revision_walk(), you have a history with some
bottom boundaries, and when there isn't, your bottom boundaries are
root commits.  If you want to behave differently depending on how
the user gave us the revision range from the command line, e.g.
acting differently between A ^B and B..A, cmdline is a good
place to learn the exact form, but at the history traversal level, I
do not think you should even care.  Why does the code even look at
the cmdline, and not revs->pending?



Re: [PATCH 4/4] fast-import: only store commit objects

2013-05-06 Thread Felipe Contreras
On Mon, May 6, 2013 at 4:19 PM, Felipe Contreras
felipe.contre...@gmail.com wrote:
 On Mon, May 6, 2013 at 10:18 AM, Junio C Hamano gits...@pobox.com wrote:
 Michael Haggerty mhag...@alum.mit.edu writes:

 Yes, it can be handy to start loading the first blobfile in parallel
 with the later stages of the conversion, before the second dumpfile is
 ready.  In that case the user needs to pass --export-marks to the first
 fast-import process to export marks on blobs so that the marks can be
 passed to the second fast-import via --import-marks.

 So the proposed change would break a documented use of cvs2git.

 Making the export of blob marks optional would of course be OK, as long
 as the default is to export them.

 Thanks for a concise summary.  Your use case fits exactly what
 Felipe conjectured as the nonexistent minority.

 Not true. cvs2git does *not* rely on the blobs being stored in a marks
 file, because cvs2git does not rely on mark files at all.

 An option that lets the caller say I only care about marks on these
 types of objects to be written to (and read from) the exported marks
 file would help Felipe's use case without harming your use case,
 and would be a sane and safe way to go.

 His case is not harmed at all. It's only the unfortunate command that
 is mentioned in the documentation that didn't need to be mentioned at
 all in the first place.

 It should be the other way around, if it's only this documentation
 that is affected, we could add a switch for that particular command,
 and the documentation should be updated, but it's overkill to add a
 switch for one odd command in some documentation somewhere, it would
 be much better to update the odd command to avoid using marks at all,
 which is what the more appropriate command does, right below in the
 same documentation.

This would simplify the documentation, and obliterate the need to use
mark files at all:

diff -ur cvs2svn-2.4.0/www/cvs2git.html cvs2svn-2.4.0-mod/www/cvs2git.html
--- cvs2svn-2.4.0/www/cvs2git.html  2012-09-22 01:49:55.0 -0500
+++ cvs2svn-2.4.0-mod/www/cvs2git.html  2013-05-06 16:33:12.070189985 -0500
@@ -355,14 +355,13 @@
   fast-import</tt>:</p>
 
 <pre>
-git fast-import --export-marks=../cvs2svn-tmp/git-marks.dat &lt; ../cvs2svn-tmp/git-blob.dat
-git fast-import --import-marks=../cvs2svn-tmp/git-marks.dat &lt; ../cvs2svn-tmp/git-dump.dat
+cat ../cvs2svn-tmp/git-blob.dat ../cvs2svn-tmp/git-dump.dat | git fast-import
 </pre>
 
-<p>On Linux/Unix this can be shortened to:</p>
+<p>On Windows you should use "type" instead:</p>
 
 <pre>
-cat ../cvs2svn-tmp/git-blob.dat ../cvs2svn-tmp/git-dump.dat | git fast-import
+type ../cvs2svn-tmp/git-blob.dat ../cvs2svn-tmp/git-dump.dat | git fast-import
 </pre>
 
   </li>
Only in cvs2svn-2.4.0-mod/www: .cvs2git.html.swp
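For reference, the two-stage flow being debated (blobs imported first with their marks exported, commits built later by a separate fast-import run resolving those marks) can be exercised end to end with a toy stream. File names here are illustrative; the `--export-marks`/`--import-marks` flags and the stream syntax are standard fast-import features:

```shell
# Toy reproduction of the cvs2git two-process flow.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q

# Stage 1: blobs only; remember their marks across processes.
cat >blobs.dat <<'EOF'
blob
mark :1
data 6
hello
EOF
git fast-import --quiet --export-marks=marks.dat <blobs.dat

# Stage 2: a second fast-import run builds the commit, resolving
# blob mark :1 through the imported marks file.
cat >commits.dat <<'EOF'
commit refs/heads/master
mark :2
committer A U Thor <author@example.com> 1234567890 +0000
data 4
msg
M 100644 :1 file
EOF
git fast-import --quiet --import-marks=marks.dat <commits.dat

git cat-file -p refs/heads/master:file   # prints "hello"
```

If marks for blobs were no longer written by the first run, stage 2 here would fail to resolve `:1`; that is the compatibility concern Michael raises, while Felipe's point is that the single-process `cat blobs commits | git fast-import` form needs no marks file at all.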


-- 
Felipe Contreras


Re: another packed-refs race

2013-05-06 Thread Jeff King
On Mon, May 06, 2013 at 02:41:22PM -0400, Jeff King wrote:

 That is a weaker guarantee, and I think we can provide it with:
 
   1. Load all loose refs into cache for a particular enumeration.
 
   2. Make sure the packed-refs cache is up-to-date (by checking its
  stat() information and reloading if necessary).
 
   3. Run the usual iteration over the loose/packed ref caches.

This does seem to work in my experiments. With stock git, I can trigger
the race reliably with:

  # base load, in one terminal, as before
  git init -q repo &&
  cd repo &&
  git commit -q --allow-empty -m one &&
  one=`git rev-parse HEAD` &&
  git commit -q --allow-empty -m two &&
  two=`git rev-parse HEAD` &&
  sha1=$one &&
  while true; do
    # this re-creates the loose ref in .git/refs/heads/master
    if test $sha1 = $one; then
      sha1=$two
    else
      sha1=$one
    fi &&
    git update-ref refs/heads/master $sha1 &&

    # we can remove packed-refs safely, as we know that
    # its only value is now stale. Real git would not do
    # this, but we are simulating the case that "master"
    # simply wasn't included in the last packed-refs file.
    rm -f .git/packed-refs &&

    # and now we repack, which will create an up-to-date
    # packed-refs file, and then delete the loose ref
    git pack-refs --all --prune
  done

  # in another terminal, enumerate and make sure we never miss the ref
  cd repo 
  while true; do
refs=`git.compile for-each-ref --format='%(refname)'`
echo == $refs
test -z $refs  break
  done

It usually takes about 30 seconds to hit a problem, though I measured
failures at up to 90 seconds. With the patch below (on top of the one I
posted the other day, which refreshes the packed-refs cache in
get_packed_refs), it has been running fine for 15 minutes.

The noop_each_fn is a little gross. I could also just reimplement the
recursion from do_for_each_ref_in_dir (except we don't care about flags,
trim, base, etc, and would just be calling get_ref_dir recursively).
It's a slight repetition of code, but it would be less subtle than what
I have written below (which uses a no-op callback for the side effect
that it primes the loose ref cache). Which poison do you prefer?

diff --git a/refs.c b/refs.c
index 45a7ee6..59ae7e4 100644
--- a/refs.c
+++ b/refs.c
@@ -1363,19 +1363,38 @@ static int do_for_each_ref(const char *submodule, const char *base, each_ref_fn
for_each_rawref(warn_if_dangling_symref, data);
 }
 
+static int noop_each_fn(const char *ref, const unsigned char *sha1, int flags,
+   void *data)
+{
+   return 0;
+}
+
 static int do_for_each_ref(const char *submodule, const char *base, each_ref_fn fn,
   int trim, int flags, void *cb_data)
 {
struct ref_cache *refs = get_ref_cache(submodule);
-   struct ref_dir *packed_dir = get_packed_refs(refs);
-   struct ref_dir *loose_dir = get_loose_refs(refs);
+   struct ref_dir *packed_dir;
+   struct ref_dir *loose_dir;
int retval = 0;
 
-   if (base && *base) {
-   packed_dir = find_containing_dir(packed_dir, base, 0);
+   /*
+* Prime the loose ref cache; we must make sure the packed ref cache is
+* uptodate after we read the loose refs in order to avoid race
+* conditions with a simultaneous pack-refs --prune.
+*/
+   loose_dir = get_loose_refs(refs);
+   if (base && *base)
loose_dir = find_containing_dir(loose_dir, base, 0);
+   if (loose_dir) {
+   sort_ref_dir(loose_dir);
+   do_for_each_ref_in_dir(loose_dir, 0, base, noop_each_fn, 0, 0,
+  NULL);
}
 
+   packed_dir = get_packed_refs(refs);
+   if (base && *base)
+   packed_dir = find_containing_dir(packed_dir, base, 0);
+
	if (packed_dir && loose_dir) {
sort_ref_dir(packed_dir);
sort_ref_dir(loose_dir);




Re: [PATCH v3] Enable minimal stat checking

2013-05-06 Thread Jeff King
On Tue, Jan 22, 2013 at 08:49:22AM +0100, Robin Rosenberg wrote:

 Specifically the fields uid, gid, ctime, ino and dev are set to zero
 by JGit. Other implementations, eg. Git in cygwin are allegedly also
 somewhat incompatible with Git For Windows and on *nix platforms
 the resolution of the timestamps may differ.

This is an old commit, but I noticed a bug today...

 This change introduces a core.checkstat config option where the
 [...]
 +core.checkstat::
 [...]
 + if (!strcmp(var, "core.statinfo")) {

One of these is not like the others. I didn't prepare a patch, though,
because I wasn't sure which it was supposed to be. A documentation bug
or a code bug? :)

-Peff


Re: [PATCH 0/7] Make $remote/$branch work with unconventional refspecs

2013-05-06 Thread Johan Herland
On Mon, May 6, 2013 at 7:06 PM, Junio C Hamano gits...@pobox.com wrote:
 Johan Herland jo...@herland.net writes:
 Let me try to summarize my views on how refnames should work in Git, to
 see if we can identify where we differ on the principles (or if we, in
 fact, differ at all):

 Thanks; I think I already said where I think we differ in a separate
 message, but a short version is that the point of remote.$nick.fetch
 mapping is to solve The remote may call a ref $this, which is not
 the refname I want to or can use in my repository, so here is the
 rule to use when importing it in my local namespace.  With the
 mapping, I can name the ref in my local namespace conveniently.
 E.g. their refs/heads/master cannot be our refs/heads/master at
 the same time, so use refs/remotes/origin/master.

 The result of the above mapping, be it remotes/origin/master or
 remotes/origin/heads/master, should be designed to be useful for the
 local use of the ref in question.  If you further need to remap it
 when using it locally, there is something wrong in the mapping you
 defined in your remote.$nick.fetch mapping in the first place.

Ok, so whereas I consider the refspec to be king, and that the expansion
from convenient shorthands to full remote-tracking refnames should be
derived from the chosen refspec, you would (if I understand you correctly)
rather have a constant (i.e. independent of remotes and refspecs) set of
rules for expanding shorthands to full refnames, and if the user chooses
refspecs that don't mesh well with those rules, then that is the user's
problem, and not Git's.

 We do not force any structure under refs/remotes/; it is left
 entirely up to the user, even though we would like to suggest the
 best current practice by teaching clone and remote add to lay
 them out in a certain way.

If we were to suggest +refs/heads/*:refs/remotes/origin/heads/* as the
best practice, I assume you do want origin/master to keep working. And
since you do not want to use the configured refspec when expanding
origin/master into refs/remotes/origin/heads/master, then I assume
you would rather add a hardcoded (what I call a textual expansion in
my patches) rule that would map $nick/$name into

  /refs/remotes/$nick/heads/$name

But isn't the existence of such a rule evidence of us trying to impose
(or at least hint) at a certain structure for refs/remotes/*?

In light of this, I'm interested in your thoughts about the following
related problem that I've just started looking at:

git branch -r shows the remote-tracking branches in this repo. Currently,
AFAICS, this just spits out all refs under refs/remotes/*. This behavior
must clearly be modified if we are to allow remote-tracking tags at
refs/remotes/$remote/tags/* (they currently show up in git branch -r,
but shouldn't). One could say that the filter should merely change from
refs/remotes/* to refs/remotes/*/heads/*, but this would break for
existing (old-style) remotes. Should we add a heuristic for detecting when
to use refs/remotes/* vs. refs/remotes/*/heads/* as a filter?

My approach would be to iterate through the configured remotes, and for
each remote list all refs that match the RHS of the refspec whose LHS is
refs/heads/*. This would work for both old- and new-style remotes with
no heuristics.

If you agree that my approach is correct for enumerating remote-tracking
branches, then what is different about using the refspec when expanding
remote-tracking refs in general?

In other words, given the following configuration:

  [remote "origin"]
  fetch = +refs/heads/*:refs/foo/bar/baz/*
  [remote "foo"]
  fetch = +refs/heads/*:refs/remotes/origin/heads/*

1. In your opininon, is refs/foo/bar/baz/master a remote-tracking branch?

2. Should refs/foo/bar/baz/master be listed by git branch -r?

3. Should the origin/master shorthand notation expand to
   refs/remotes/origin/heads/master from remote foo, or
   refs/foo/bar/baz/master from remote origin?


...Johan

-- 
Johan Herland, jo...@herland.net
www.herland.net


Re: [PATCH] revision.c: Fix a sparse warning

2013-05-06 Thread Junio C Hamano
Kevin Bracey ke...@bracey.fi writes:

 On 04/05/2013 20:25, Ramsay Jones wrote:
  Sparse issues a "'sole_interesting' not declared. Should it be static?"
 warning. In order to suppress the warning, since this symbol does not
 need more than file visibility, we simply add the static modifier to
 its declaration.


 Thanks! I'll include that fix.

Thanks.


Re: [PATCH 2/7] t7900: Start testing usability of namespaced remote refs

2013-05-06 Thread Junio C Hamano
Johan Herland jo...@herland.net writes:

 +test_expect_success 'work-around clone with namespaced remote refs' '
 + rm -rf client &&
 + git init client &&
 + (
 + cd client &&
 + git remote add origin ../server &&
 + git config --unset-all remote.origin.fetch &&
 + git config --add remote.origin.fetch "+refs/heads/*:refs/remotes/origin/heads/*" &&

If you were to do this, I think you should drop the "remote add
origin" step and illustrate what configuration variables should be
prepared (at least minimally---the final implementation of "git
clone --separate-remote-layout" may add some other configuration
variable as a hint to say "this remote is using the new layout" or
somesuch) in this client repository.

That would make the test more self documenting.

I am not convinced that it is a good idea to reuse remotes/origin
hierarchy which traditionally has been branches-only like this,
though.  It may be better to use

refs/$remotes_new_layout/origin/{heads,tags,...}/*

for a value of $remotes_new_layout that is different from "remotes",
and teach the dwim_ref() machinery to pay attention to it, to avoid
confusion.  Otherwise, you wouldn't be able to tell between a topic
branch that works on tags named tags/refactor under the old layout,
and a tag that marks a good point in a refactoring effort refactor
under the new layout.

 + git config --add remote.origin.fetch "+refs/tags/*:refs/remotes/origin/tags/*" &&
 + git config --add remote.origin.fetch "+refs/notes/*:refs/remotes/origin/notes/*" &&
 + git config --add remote.origin.fetch "+refs/replace/*:refs/remotes/origin/replace/*" &&
 + git config remote.origin.tagopt --no-tags &&
 + git fetch &&
 + git checkout master
 + ) &&
 + test_clone client
 +'
 +
 +test_done


Re: [PATCH 3/7] t7900: Demonstrate failure to expand $remote/$branch according to refspecs

2013-05-06 Thread Junio C Hamano
Johan Herland jo...@herland.net writes:

 This test verifies that the following expressions all evaluate to the
 full refname refs/remotes/origin/heads/master:

As I've aleady said, I am not convinced that local refname
resolution should pay attention to refspec mapping, so I won't look
at this step.


Re: [PATCH 4/7] refs.c: Refactor rules for expanding shorthand names into full refnames

2013-05-06 Thread Junio C Hamano
Johan Herland jo...@herland.net writes:

 diff --git a/refs.c b/refs.c
 index 7231f54..8b02140 100644
 --- a/refs.c
 +++ b/refs.c
 @@ -1724,7 +1724,24 @@ const char *prettify_refname(const char *name)
   0);
  }
  
 -const char *ref_rev_parse_rules[] = {
 +static void ref_expand_txtly(const struct ref_expand_rule *rule,
 +  char *dst, size_t dst_len,
 +  const char *shortname, size_t shortname_len)
 +{
 + mksnpath(dst, dst_len, rule->pattern, shortname_len, shortname);
 +}
 +
 +const struct ref_expand_rule ref_expand_rules[] = {
 + { ref_expand_txtly, "%.*s" },
 + { ref_expand_txtly, "refs/%.*s" },
 + { ref_expand_txtly, "refs/tags/%.*s" },
 + { ref_expand_txtly, "refs/heads/%.*s" },
 + { ref_expand_txtly, "refs/remotes/%.*s" },
 + { ref_expand_txtly, "refs/remotes/%.*s/HEAD" },
 + { NULL, NULL }
 +};
 +
 +static const char *ref_rev_parse_rules[] = {
   "%.*s",
   "refs/%.*s",
   "refs/tags/%.*s",
 @@ -1734,15 +1751,17 @@ const char *ref_rev_parse_rules[] = {
   NULL
  };
  
 -int refname_match(const char *abbrev_name, const char *full_name, const char **rules)
 +int refname_match(const char *abbrev_name, const char *full_name,
 +   const struct ref_expand_rule *rules)
  {
 - const char **p;
 + const struct ref_expand_rule *p;
   const int abbrev_name_len = strlen(abbrev_name);
 + char n[PATH_MAX];

Hmmm, is it too much to ask to do this without a fixed length
buffer?  I think we have long learned the value of using strbuf to
avoid having to worry about buffer overruns.

I am OK with the idea to make ref_expand_rules[] customizable, and
the overall strategy taken by this step to refactor the current code
looks reasonably sensible.



Re: [PATCH 5/7] refs.c: Refactor code for shortening full refnames into shorthand names

2013-05-06 Thread Junio C Hamano
Johan Herland jo...@herland.net writes:

 This patch removes the only remaining user of ref_rev_parse_rules.
 It has now been fully replaced by ref_expand_rules. Hence this patch
 also removes ref_rev_parse_rules.

Yeah, I was wondering when you would do this while reading [4/7], as
removing the parse_rules[] would break the shortener side, and leaving
it in for long would risk allowing parse_rules[] and expand_rules[]
to drift apart.

 -const struct ref_expand_rule ref_expand_rules[] = {
 - { ref_expand_txtly, "%.*s" },
 - { ref_expand_txtly, "refs/%.*s" },
 - { ref_expand_txtly, "refs/tags/%.*s" },
 - { ref_expand_txtly, "refs/heads/%.*s" },
 - { ref_expand_txtly, "refs/remotes/%.*s" },
 - { ref_expand_txtly, "refs/remotes/%.*s/HEAD" },
 - { NULL, NULL }
 -};

I wonder if the previous step could have been planned a bit better;
this removal of a large block of text could have come next to its
replacement, which we see after the addition of the ref_shorten_txtly()
function.

 +static char *ref_shorten_txtly(const struct ref_expand_rule *rule,
 +const char *refname)
 +{
 +...
 +}
  
 -static const char *ref_rev_parse_rules[] = {
 - "%.*s",
 - "refs/%.*s",
 - "refs/tags/%.*s",
 - "refs/heads/%.*s",
 - "refs/remotes/%.*s",
 - "refs/remotes/%.*s/HEAD",
 - NULL
 +const struct ref_expand_rule ref_expand_rules[] = {
 + { ref_expand_txtly, NULL, "%.*s" },
 + { ref_expand_txtly, ref_shorten_txtly, "refs/%.*s" },
 + { ref_expand_txtly, ref_shorten_txtly, "refs/tags/%.*s" },
 + { ref_expand_txtly, ref_shorten_txtly, "refs/heads/%.*s" },
 + { ref_expand_txtly, ref_shorten_txtly, "refs/remotes/%.*s" },
 + { ref_expand_txtly, ref_shorten_txtly, "refs/remotes/%.*s/HEAD" },
 + { NULL, NULL, NULL }
  };
  
  int refname_match(const char *abbrev_name, const char *full_name,

  char *shorten_unambiguous_ref(const char *refname, int strict)
  {
   int i;
   char *short_name;
  
 - /* buffer for scanf result, at most refname must fit */
 - short_name = xstrdup(refname);
 -
 - /* skip first rule, it will always match */
 - for (i = ARRAY_SIZE(ref_rev_parse_rules) - 1; i > 0 ; --i) {
 + for (i = ARRAY_SIZE(ref_expand_rules) - 1; i >= 0 ; --i) {
 ...
   /*
* in strict mode, all (except the matched one) rules
* must fail to resolve to a valid non-ambiguous ref
*/
   if (strict)
 - rules_to_fail = ARRAY_SIZE(ref_rev_parse_rules);
 + rules_to_fail = ARRAY_SIZE(ref_expand_rules);

This part obviously depends on 1/7; do we still have an off-by-one
change from the original, or did I miscount when I reviewed 1/7?

Again, the overall strategy to refactor sounds sound.

It may be a lot simpler if you have ref_expand/shorten_append() and
ref_expand/shorten_append_with_HEAD() built-in helper functions.
Then you can perform the expansion and contraction without %.*s at
all.



Re: [PATCH 6/7] refname_match(): Caller must declare if we're matching local or remote refs

2013-05-06 Thread Junio C Hamano
Johan Herland jo...@herland.net writes:

 refname_match() is used to check whether a given shorthand name matches a
 given full refname, but that full refname does not always belong in the
 local repo; rather, it is sometimes taken from a list of refs sent over from
 a remote repo.

That "local vs remote" is a wrong way to think about things.

All refs you can feed to resolve_ref() and dwim_ref() are local, and
remotes is not all that special.  It is just a convention to store
many different kind of things per different hierarchy, and the only
special hierarchy we have is refs/heads/* that can be updated by
making new commits through the index and the working tree.



Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

2013-05-06 Thread Junio C Hamano
Felipe Contreras felipe.contre...@gmail.com writes:

 How?

  * if we teach fast-import to optionally not write marks for blobs
and trees out, your remote-bzr can take advantage of it,

 I already said remote-bzr is irrelevant. *Everybody* benefits.

Everybody who does _not_ need to look at marks for non-commits from a
previous run does.  What about the others who do?

Surely, some lucky ones may get the benefit of a new optimization
for free if you turn it on unconditionally without an escape hatch,
but that goes against our goal of not knowingly introducing any
regression. Michael's cvs2git might have a way to work around the
breakage (I will let him comment on your suggested workaround), but as
long as he has been coding it using the documented feature, why
should he even change anything for no gain at all in the first
place?  Even if you have a workaround, that does not change the fact
that a removal of a feature that somebody has been depending on is a
regression.

What's so difficult to understand? By default, the responsibility of
making sure an optimization applies safely to a program that uses a
new optimization lies on that program; in other words, a new
feature is by default an opt-in.



git apply --interactive

2013-05-06 Thread Mark Lodato
I'd love for "git apply" to have an equivalent to "git add -p".  (I'm
guessing it would be called "--interactive" since "git apply -p" is
taken and "--patch" would be confusing.)  Has anyone thought about this?

After taking a quick look at git-add--interactive.perl, it seems like
the main steps would be:
1) making sure the patch comes from a file and not stdin, since stdin
is required for the interactive input
2) collecting the input patch(es) and separating by output file
3) modifying parse_diff to accept these pre-split patches
4) coloring the patch
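
Step 2, splitting a combined patch into per-file pieces, might look
roughly like this (Python sketch, hypothetical; the real work would
presumably live in git-add--interactive.perl alongside parse_diff):

```python
def split_patch_by_file(patch_text):
    """Split a multi-file patch into one chunk per "diff --git" header.

    Any preamble before the first header is dropped in this sketch.
    """
    chunks, current = [], None
    for line in patch_text.splitlines(keepends=True):
        if line.startswith("diff --git "):
            if current is not None:      # close the previous file's chunk
                chunks.append("".join(current))
            current = []
        if current is not None:
            current.append(line)
    if current:                          # flush the final chunk
        chunks.append("".join(current))
    return chunks
```

Each chunk could then be handed to the existing hunk-splitting and
coloring machinery.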

Mark


Re: [PATCH 0/7] Make $remote/$branch work with unconventional refspecs

2013-05-06 Thread Junio C Hamano
Johan Herland jo...@herland.net writes:

 Ok, so whereas I consider the refspec to be king, and that the expansion
 from convenient shorthands to full remote-tracking refnames should be
 derived from the chosen refspec, you would (if I understand you correctly)
 rather have a constant (i.e. independent of remotes and refspecs) set of
 rules for expanding shorthands to full refnames, and if the user chooses
 refspecs that don't mesh well with those rules, then that is the user's
 problem, and not Git's.

You need to dig your rhetoric from the other end of the tunnel.  I
would consider the local namespace for the refs to be the king,
and we would design how they are resolved to be useful for user's
local usage pattern.  How the remote refs are copied by fetch
following refspecs is one of the many components (e.g. the dwimming
machinery "checkout foo" uses to guess that the user may want to
fork from and integrate later with one of the refs under refs/remotes
is one of them and it is not fetch) to complement the refname
resolution rule to support the local usage of refs.

 In light of this, I'm interested in your thoughts about the following
 related problem that I've just started looking at:

 git branch -r shows the remote-tracking branches in this repo. Currently,
  Should we add a heuristic for detecting when
 to use refs/remotes/* vs. refs/remotes/*/heads/* as a filter?

Didn't I already say that I do not think repurposing refs/remotes/
for these unified copies is the best approach?

A change that I think is a good thing to add on top of your [45]/7
refactoring is to allow the user to add custom expansion/contraction
rules.  Then the user can group refs regardless of remotes and
give meaningful shortening.

As I said in a very early review, viewing the fetch refspec as king,
and refspecs as the only way the user may want to group things
locally, is myopic.  The every-day usage of the local names
ought to be the king, and everything else should serve to make it
easier to use.




[PATCH 0/4] fix packed-refs races

2013-05-06 Thread Jeff King
This fixes the races I brought up in the surrounding thread:

  http://thread.gmane.org/gmane.comp.version-control.git/223299

The individual races are described in more detail in each commit, but for
reference, here is the complete reproduction recipe (which I posted
already in several parts throughout the thread, but is collected here):

  # base.sh
  # run this script forever in one terminal
  git init -q repo &&
  cd repo &&
  git commit -q --allow-empty -m one &&
  one=`git rev-parse HEAD` &&
  git commit -q --allow-empty -m two &&
  two=`git rev-parse HEAD` &&
  sha1=$one &&
  while true; do
    # this re-creates the loose ref in .git/refs/heads/master
    if test "$sha1" = "$one"; then
      sha1=$two
    else
      sha1=$one
    fi &&
    git update-ref refs/heads/master $sha1 &&

    # we can remove packed-refs safely, as we know that
    # its only value is now stale. Real git would not do
    # this, but we are simulating the case that "master"
    # simply wasn't included in the last packed-refs file.
    rm -f .git/packed-refs &&

    # and now we repack, which will create an up-to-date
    # packed-refs file, and then delete the loose ref
    git pack-refs --all --prune
  done

  # lookup.sh
  # run this script simultaneously in another terminal; when it exits,
  # git is broken (it erroneously told us that "master" did not exist).
  # It is fixed by applying up to patch 3/4 below.
  cd repo &&
  while true; do
    ref=`git rev-parse --verify master`
    echo "== $ref"
    test -z "$ref" && break
  done

  # enumerate.sh
  # run this script simultaneously in another terminal (either with
  # lookup.sh running, too, or just base.sh); when it exits,
  # for-each-ref has erroneously missed "master". It is fixed by applying
  # up to patch 4/4 below.
  cd repo &&
  while true; do
    refs=`git.compile for-each-ref --format='%(refname)'`
    echo "== $refs"
    test -z "$refs" && break
  done

I don't think this is a purely hypothetical race. About once a month for the
past six months or so, I'll run across a corrupted repository on
github.com that is inexplicably missing a few objects near the tip of a
ref. This last time, I happened to catch something very odd in our ref
audit-log. The log for the repo that became corrupted showed a small
push to the tip of master, but the objects for that push were missing
(the previous ref tip was fine). At almost the exact same time, our
alternates repo was running a fetch --prune from the corrupted repo.
But it didn't see the corrupted master ref; it saw that
refs/heads/master didn't exist at all, and it deleted it.

I believe that git-upload-pack hit the enumeration race above, and
showed a ref advertisement in which refs/heads/master was missing[1],
leading fetch to delete the ref. At the same time, a gc was running in
the corrupted repo which hit the same issue, and ended up pruning[2] the
newly pushed objects, which were not referenced from anywhere else.

It may seem unlikely for two programs to hit the race at the same time
(especially given the number of iterations you need to trigger the race
with the scripts above), but I think it's exacerbated when you have a
very large number of refs (because the loose enumeration takes much
longer). So I think there's a good chance this was my problem. And if
not, it is good to fix anyway. :)

  [1/4]: resolve_ref: close race condition for packed refs
  [2/4]: add a stat_validity struct
  [3/4]: get_packed_refs: reload packed-refs file when it changes
  [4/4]: for_each_ref: load all loose refs before packed refs

-Peff

[1] Usually the enumeration race would cause you to see the old value
from the packed-refs file, not a completely missing ref (unless the
ref was not packed at all previously, but that is almost certainly
not the case here). But it does call into resolve_ref to actually
read the ref, which has its own race that can cause refs to
disappear (and is fixed by patch 1).

As an aside, I think that calling resolve_ref is probably wrong. We
know we have a fully qualified refname, because we keep track of
which directory we are calling readdir() on. But if for whatever
reason somebody deletes it at the exact moment between when we see
it in readdir() and when we try to open it (not packs it, but
actually deletes it), then we'll see it is missing and fall back to
trying other ref-lookups. So in theory a race with a delete of
refs/heads/foo could accidentally retrieve the value for
refs/heads/refs/heads/foo.

It's quite unlikely, though, and I'm not too concerned about it.

[2] Recently pushed objects should still be safe, even if they are
unreachable. However, in our case the pruning was exacerbated by a
gc codepath that could call git repack -ad (not in upstream git,
but we do our own custom gc due to the alternates structure of our
forks). I've also fixed that, as we should always be using -A to
keep recent unreachable objects.

[PATCH 1/4] resolve_ref: close race condition for packed refs

2013-05-06 Thread Jeff King
When we attempt to resolve a ref, the first thing we do is
call lstat() to see if it is a symlink or a real ref. If we
find that the ref is missing, we fall back to looking it up
in the packed-refs file. If we find the loose ref does exist
(and is a regular file), we continue with trying to open it.

However, we do not do the same fallback if our open() call
fails; we just report the ref as missing.  A git pack-refs
--prune process which is simultaneously running may remove
the loose ref between our lstat() and open().  In this case,
we would erroneously report the ref as missing, even though
we could find it if we checked the packed-refs file.

This patch solves it by factoring out the fallback code from
the lstat() case and calling it from both places. We do not
need to do the same thing for the symlink/readlink code
path, even though it might receive ENOENT, too, because
symlinks cannot be packed (so if it disappears after the
lstat, it is truly being deleted).
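
The fallback being described amounts to treating ENOENT from either the
lstat() or the open() path identically. A Python sketch under that
assumption (git's real code is the refs.c diff below):

```python
import errno

def resolve_ref(loose_path, packed_refs, refname):
    """Read a loose ref, falling back to the packed-refs mapping on
    ENOENT, whether the missing file was noticed by lstat() or by an
    open() that raced with pack-refs --prune."""
    try:
        with open(loose_path) as f:      # may vanish between lstat and open
            return f.read().strip()
    except OSError as e:
        if e.errno != errno.ENOENT:
            raise                        # broken loose ref: no fallback
        return packed_refs.get(refname)  # packed copy, or None if truly gone
```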

Note that this solves only the part of the race within
resolve_ref_unsafe. In the situation described above, we may
still be depending on a cached view of the packed-refs file;
that race will be dealt with in a future patch.

Signed-off-by: Jeff King p...@peff.net
---

 refs.c | 63 ++-
 1 file changed, 42 insertions(+), 21 deletions(-)

diff --git a/refs.c b/refs.c
index de2d8eb..5a14703 100644
--- a/refs.c
+++ b/refs.c
@@ -1083,6 +1083,43 @@ static int get_packed_ref(const char *refname, unsigned char *sha1)
return -1;
 }
 
+/*
+ * This should be called from resolve_ref_unsafe when a loose ref cannot be
+ * accessed; err must represent the errno from the last attempt to access the
+ * loose ref, and the other options are forwarded from resolve_ref_unsafe.
+ */
+static const char *handle_loose_ref_failure(int err,
+   const char *refname,
+   unsigned char *sha1,
+   int reading,
+   int *flag)
+{
+   /*
+* If we didn't get ENOENT, something is broken
+* with the loose ref, and we should not fallback,
+* but instead pretend it doesn't exist.
+*/
+   if (err != ENOENT)
+   return NULL;
+
+   /*
+* The loose reference file does not exist;
+* check for a packed reference.
+*/
+   if (!get_packed_ref(refname, sha1)) {
+   if (flag)
+   *flag |= REF_ISPACKED;
+   return refname;
+   }
+
+   /* The reference is not a packed reference, either. */
+   if (reading)
+   return NULL;
+
+   hashclr(sha1);
+   return refname;
+}
+
const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int reading, int *flag)
 {
int depth = MAXDEPTH;
@@ -1107,26 +1144,9 @@ const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int rea
 
	git_snpath(path, sizeof(path), "%s", refname);
 
-   if (lstat(path, &st) < 0) {
-   if (errno != ENOENT)
-   return NULL;
-   /*
-* The loose reference file does not exist;
-* check for a packed reference.
-*/
-   if (!get_packed_ref(refname, sha1)) {
-   if (flag)
-   *flag |= REF_ISPACKED;
-   return refname;
-   }
-   /* The reference is not a packed reference, either. */
-   if (reading) {
-   return NULL;
-   } else {
-   hashclr(sha1);
-   return refname;
-   }
-   }
+   if (lstat(path, &st) < 0)
+   return handle_loose_ref_failure(errno, refname, sha1,
+   reading, flag);
 
/* Follow normalized - ie refs/.. symlinks by hand */
if (S_ISLNK(st.st_mode)) {
@@ -1156,7 +1176,8 @@ const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int rea
 */
fd = open(path, O_RDONLY);
	if (fd < 0)
-   return NULL;
+   return handle_loose_ref_failure(errno, refname, sha1,
+   reading, flag);
len = read_in_full(fd, buffer, sizeof(buffer)-1);
close(fd);
	if (len < 0)
-- 
1.8.3.rc1.2.g12db477


[PATCH 3/4] get_packed_refs: reload packed-refs file when it changes

2013-05-06 Thread Jeff King
Once we read the packed-refs file into memory, we cache it
to save work on future ref lookups. However, our cache may
be out of date with respect to what is on disk if another
process is simultaneously packing the refs. Normally it
is acceptable for us to be a little out of date, since there
is no guarantee whether we read the file before or after the
simultaneous update. However, there is an important special
case: our packed-refs file must be up to date with respect
to any loose refs we read. Otherwise, we risk the following
race condition:

  0. There exists a loose ref refs/heads/master.

  1. Process A starts and looks up the ref master. It
 first checks $GIT_DIR/master, which does not exist. It
 then loads (and caches) the packed-refs file to see if
 master exists in it, which it does not.

  2. Meanwhile, process B runs pack-refs --all --prune. It
 creates a new packed-refs file which contains
 refs/heads/master, and removes the loose copy at
 $GIT_DIR/refs/heads/master.

  3. Process A continues its lookup, and eventually tries
 $GIT_DIR/refs/heads/master.  It sees that the loose ref
 is missing, and falls back to the packed-refs file. But
 it examines its cached version, which does not have
 refs/heads/master. After trying a few other prefixes,
 it reports master as a non-existent ref.

There are many variants (e.g., step 1 may involve process A
looking up another ref entirely, so even a fully qualified
refname can fail). One of the most interesting ones is if
refs/heads/master is already packed. In that case process
A will not see it as missing, but rather will report
whatever value happened to be in the packed-refs file before
process B repacked (which might be an arbitrarily old
value).
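The "few other prefixes" tried in step 3 refer to Git's short-name resolution rules. As a rough sketch: the pattern table below mirrors the ref_rev_parse_rules list in refs.c, while expand_ref_rule is an illustrative helper invented for this example, not a Git function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Candidate patterns tried when resolving a short ref name, mirroring
 * Git's ref_rev_parse_rules table (order matters: first match wins). */
static const char *ref_parse_rules[] = {
	"%.*s",
	"refs/%.*s",
	"refs/tags/%.*s",
	"refs/heads/%.*s",
	"refs/remotes/%.*s",
	"refs/remotes/%.*s/HEAD",
	NULL
};

/* Expand rule i for the given short name into buf; returns buf.
 * (Illustrative helper; Git iterates these patterns in resolve logic.) */
static const char *expand_ref_rule(int i, const char *name,
				   char *buf, size_t len)
{
	snprintf(buf, len, ref_parse_rules[i], (int)strlen(name), name);
	return buf;
}
```

So a lookup of "master" probes "master", "refs/master", "refs/tags/master", "refs/heads/master", and so on, and each probe in step 3 consults the (possibly stale) cached packed-refs view.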

We can fix this by making sure we reload the packed-refs
file from disk after looking at any loose refs. That's
unacceptably slow, so we can check its stat()-validity as a
proxy, and read it only when it appears to have changed.

Reading the packed-refs file after performing any loose-ref
system calls is sufficient because we know the ordering of
the pack-refs process: it always makes sure the newly
written packed-refs file is installed into place before
pruning any loose refs. As long as those operations by B
appear in their executed order to process A, by the time A
sees the missing loose ref, the new packed-refs file must be
in place.

Signed-off-by: Jeff King p...@peff.net
---
I hooked the refreshing into get_packed_refs, since then all callers get
it for free. It makes me a little nervous, though, just in case some
caller really cares about calling get_packed_refs but not having the
list of packed-refs change during the call. peel_ref looks like such a
function, but isn't, for reasons I'll explain in a followup patch.

Clone also looks like such a caller, as it calls get_packed_refs once
for each upstream ref it adds (it puts them in the packed-refs list, and
then writes them all out at the end). But it's OK because there is no
packed-refs file for it to refresh from.

An alternative would be to move the refreshing to an explicit
refresh_packed_refs() function, and call it from a few places
(resolve_ref, and later from do_for_each_ref).

 refs.c | 20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/refs.c b/refs.c
index 5a14703..6afe8cc 100644
--- a/refs.c
+++ b/refs.c
@@ -708,6 +708,7 @@ static struct ref_cache {
struct ref_cache *next;
struct ref_entry *loose;
struct ref_entry *packed;
+   struct stat_validity packed_validity;
	/* The submodule name, or "" for the main repo. */
char name[FLEX_ARRAY];
 } *ref_cache;
@@ -717,6 +718,7 @@ static void clear_packed_ref_cache(struct ref_cache *refs)
	if (refs->packed) {
		free_ref_entry(refs->packed);
		refs->packed = NULL;
+		stat_validity_clear(&refs->packed_validity);
}
 }
 
@@ -878,17 +880,25 @@ static struct ref_dir *get_packed_refs(struct ref_cache *refs)
 
 static struct ref_dir *get_packed_refs(struct ref_cache *refs)
 {
+   const char *packed_refs_file;
+
+   if (*refs->name)
+   packed_refs_file = git_path_submodule(refs->name, "packed-refs");
+   else
+   packed_refs_file = git_path("packed-refs");
+
+   if (refs->packed &&
+   !stat_validity_check(&refs->packed_validity, packed_refs_file))
+   clear_packed_ref_cache(refs);
+
	if (!refs->packed) {
-   const char *packed_refs_file;
FILE *f;
 
		refs->packed = create_dir_entry(refs, "", 0, 0);
-   if (*refs->name)
-   packed_refs_file = git_path_submodule(refs->name, "packed-refs");
-   else
-   packed_refs_file = git_path("packed-refs");
+
		f = fopen(packed_refs_file, "r");
if (f) {
+   stat_validity_update(&refs->packed_validity, 

[PATCH 4/4] for_each_ref: load all loose refs before packed refs

2013-05-06 Thread Jeff King
If we are iterating through the refs using for_each_ref (or
any of its sister functions), we can get into a race
condition with a simultaneous pack-refs --prune that looks
like this:

  0. We have a large number of loose refs, and a few packed
 refs. refs/heads/z/foo is loose, with no matching entry
 in the packed-refs file.

  1. Process A starts iterating through the refs. It loads
 the packed-refs file from disk, then starts lazily
 traversing through the loose ref directories.

  2. Process B, running pack-refs --prune, writes out the
 new packed-refs file. It then deletes the newly packed
 refs, including refs/heads/z/foo.

  3. Meanwhile, process A has finally gotten to
 refs/heads/z (it traverses alphabetically). It
 descends, but finds nothing there.  It checks its
 cached view of the packed-refs file, but it does not
 mention anything in refs/heads/z/ at all (it predates
 the new file written by B in step 2).

The traversal completes successfully without mentioning
refs/heads/z/foo at all (the name, of course, isn't
important; but the more refs you have and the farther down
the alphabetical list a ref is, the more likely it is to hit
the race). If refs/heads/z/foo did exist in the packed refs
file at state 0, we would see an entry for it, but it would
show whatever sha1 the ref had the last time it was packed
(which could be an arbitrarily long time ago).

This can be especially dangerous when process A is "git
prune", as it means our set of reachable tips will be
incomplete, and we may erroneously prune objects reachable
from that tip (the same thing can happen if "repack -ad" is
used, as it simply drops unreachable objects that are
packed).

This patch solves it by loading all of the loose refs for
our traversal into our in-memory cache, and then refreshing
the packed-refs cache. Because a pack-refs writer will
always put the new packed-refs file into place before
starting the prune, we know that any loose refs we fail to
see will either truly be missing, or will have already been
put in the packed-refs file by the time we refresh.

Signed-off-by: Jeff King p...@peff.net
---
So this cache-priming is ugly, but I don't think it will have a
performance impact, as it is hitting only directories and
files that we were about to see in a second anyway during
the real traversal. The only difference is that there is a
slight latency before we would start producing any output,
as we read the whole loose refs tree (well, the part we are
interested in for this call) before processing any refs.

In theory we could do our lazy walk, and then at the end
double-check that packed-refs had not changed. But if it
does change, we couldn't output any newly found refs, then;
they would be out of sorted order. The only thing we could
do is die().  Which might be OK, and is certainly preferable
to silently dropping a ref and indicating that we processed
the whole list. But I think the latency is probably not that
big a deal.
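The cache-priming recursion above amounts to an eager depth-first walk. As a standalone sketch (prime_dir is an illustrative analogue of prime_ref_dir; it forces every subdirectory to be read, the way loose-ref directories must all be read before the packed-refs file is consulted):

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Eagerly walk a directory tree, forcing every subdirectory to be
 * read, analogous to prime_ref_dir() forcing all loose refs into
 * the in-memory cache. Returns the number of regular files seen,
 * or -1 if the top-level directory cannot be opened. Traversal
 * order is irrelevant, just as the patch notes for the real code. */
static int prime_dir(const char *path)
{
	DIR *d = opendir(path);
	struct dirent *de;
	int count = 0;

	if (!d)
		return -1;
	while ((de = readdir(d)) != NULL) {
		char sub[4096];
		struct stat st;
		if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
			continue;
		snprintf(sub, sizeof(sub), "%s/%s", path, de->d_name);
		if (stat(sub, &st) < 0)
			continue;
		if (S_ISDIR(st.st_mode)) {
			int n = prime_dir(sub); /* recurse, as for REF_DIR entries */
			if (n > 0)
				count += n;
		} else if (S_ISREG(st.st_mode)) {
			count++;
		}
	}
	closedir(d);
	return count;
}
```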

 refs.c | 38 +-
 1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/refs.c b/refs.c
index 6afe8cc..c127baf 100644
--- a/refs.c
+++ b/refs.c
@@ -641,6 +641,21 @@ static int do_for_each_ref_in_dirs(struct ref_dir *dir1,
 }
 
 /*
+ * Load all of the refs from the dir into our in-memory cache. The hard work
+ * of loading loose refs is done by get_ref_dir(), so we just need to recurse
+ * through all of the sub-directories. We do not even need to care about
+ * sorting, as traversal order does not matter to us.
+ */
+static void prime_ref_dir(struct ref_dir *dir)
+{
+   int i;
+   for (i = 0; i < dir->nr; i++) {
+   struct ref_entry *entry = dir->entries[i];
+   if (entry->flag & REF_DIR)
+   prime_ref_dir(get_ref_dir(entry));
+   }
+}
+/*
  * Return true iff refname1 and refname2 conflict with each other.
  * Two reference names conflict if one of them exactly matches the
  * leading components of the other; e.g., foo/bar conflicts with
@@ -1351,14 +1366,27 @@ static int do_for_each_ref(const char *submodule, const char *base, each_ref_fn
   int trim, int flags, void *cb_data)
 {
struct ref_cache *refs = get_ref_cache(submodule);
-   struct ref_dir *packed_dir = get_packed_refs(refs);
-   struct ref_dir *loose_dir = get_loose_refs(refs);
+   struct ref_dir *packed_dir;
+   struct ref_dir *loose_dir;
int retval = 0;
 
-   if (base && *base) {
-   packed_dir = find_containing_dir(packed_dir, base, 0);
+   /*
+* We must make sure that all loose refs are read before accessing the
+* packed-refs file; this avoids a race condition in which loose refs
+* are migrated to the packed-refs file by a simultaneous process, but
+* our in-memory view is from before the migration. get_packed_refs()
+* takes care of making sure our view is up to date with what is on
+  

[PATCH 0/2] peel_ref cleanups changes

2013-05-06 Thread Jeff King
On Mon, May 06, 2013 at 10:43:13PM -0400, Jeff King wrote:

 I hooked the refreshing into get_packed_refs, since then all callers get
 it for free. It makes me a little nervous, though, just in case some
 caller really cares about calling get_packed_refs but not having the
 list of packed-refs change during the call. peel_ref looks like such a
 function, but isn't, for reasons I'll explain in a followup patch.

I thought I would need the cleanups below for peel_ref, but it turns out
that the middle chunk of the function is never exercised. They conflict
textually with Michael's mh/packed-refs-various topic, so I think it may
make sense to just rebase them on top of that.

  [1/2]: peel_ref: rename sha1 argument to peeled
  [2/2]: peel_ref: refactor for safety with simultaneous update

-Peff


[PATCH 1/2] peel_ref: rename sha1 argument to peeled

2013-05-06 Thread Jeff King
The sha1 argument of peel_ref is meant to hold the peeled
object name for output. Let's call it peeled which makes
it more clear that it is not an input sha1 (especially as we
will be adding such an input sha1 in the next patch).

Signed-off-by: Jeff King p...@peff.net
---
Simple cleanup for the next step.

 refs.c | 8 
 refs.h | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/refs.c b/refs.c
index de2d8eb..89f8141 100644
--- a/refs.c
+++ b/refs.c
@@ -1231,7 +1231,7 @@ static int filter_refs(const char *refname, const unsigned char *sha1, int flags
	return filter->fn(refname, sha1, flags, filter->cb_data);
 }
 
-int peel_ref(const char *refname, unsigned char *sha1)
+int peel_ref(const char *refname, unsigned char *peeled)
 {
int flag;
unsigned char base[20];
@@ -1242,7 +1242,7 @@ int peel_ref(const char *refname, unsigned char *sha1)
	if (current_ref->flag & REF_KNOWS_PEELED) {
		if (is_null_sha1(current_ref->u.value.peeled))
			return -1;
-   hashcpy(sha1, current_ref->u.value.peeled);
+   hashcpy(peeled, current_ref->u.value.peeled);
		return 0;
	}
	hashcpy(base, current_ref->u.value.sha1);
@@ -1257,7 +1257,7 @@ int peel_ref(const char *refname, unsigned char *sha1)
struct ref_entry *r = find_ref(dir, refname);
 
	if (r != NULL && r->flag & REF_KNOWS_PEELED) {
-   hashcpy(sha1, r->u.value.peeled);
+   hashcpy(peeled, r->u.value.peeled);
return 0;
}
}
@@ -1274,7 +1274,7 @@ fallback:
	if (o->type == OBJ_TAG) {
		o = deref_tag_noverify(o);
		if (o) {
-   hashcpy(sha1, o->sha1);
+   hashcpy(peeled, o->sha1);
return 0;
}
}
diff --git a/refs.h b/refs.h
index a35eafc..1e8b4e1 100644
--- a/refs.h
+++ b/refs.h
@@ -61,7 +61,7 @@ extern int ref_exists(const char *);
 
 extern int ref_exists(const char *);
 
-extern int peel_ref(const char *refname, unsigned char *sha1);
+extern int peel_ref(const char *refname, unsigned char *peeled);
 
 /** Locks a refs/ ref returning the lock on success and NULL on failure. **/
 extern struct ref_lock *lock_ref_sha1(const char *refname, const unsigned char 
*old_sha1);
-- 
1.8.3.rc1.2.g12db477



Re: [PATCH 4/4] fast-import: only store commit objects

2013-05-06 Thread Michael Haggerty
On 05/06/2013 11:19 PM, Felipe Contreras wrote:
 On Mon, May 6, 2013 at 10:18 AM, Junio C Hamano gits...@pobox.com wrote:
 Michael Haggerty mhag...@alum.mit.edu writes:

 Yes, it can be handy to start loading the first blobfile in parallel
 with the later stages of the conversion, before the second dumpfile is
 ready.  In that case the user needs to pass --export-marks to the first
 fast-import process to export marks on blobs so that the marks can be
 passed to the second fast-import via --import-marks.

 So the proposed change would break a documented use of cvs2git.

 Making the export of blob marks optional would of course be OK, as long
 as the default is to export them.

 Thanks for a concise summary.  Your use case fits exactly what
 Felipe conjectured as the nonexistent minority.
 
 Not true. cvs2git does *not* rely on the blobs being stored in a marks
 file, because cvs2git does not rely on mark files at all.
 
 An option that lets the caller say I only care about marks on these
 types of objects to be written to (and read from) the exported marks
 file would help Felipe's use case without harming your use case,
 and would be a sane and safe way to go.
 
 His case is not harmed at all. It's only the unfortunate command that
 is mentioned in the documentation that didn't need to be mentioned at
 all in the first place.
 
 It should be the other way around, if it's only this documentation
 that is affected, we could add a switch for that particular command,
 and the documentation should be updated, but it's overkill to add a
 switch for one odd command in some documentation somewhere, it would
 be much better to update the odd command to avoid using marks at all,
 which is what the more appropriate command does, right below in the
 same documentation.
 
   cat ../cvs2svn-tmp/git-blob.dat ../cvs2svn-tmp/git-dump.dat | git 
 fast-import
 
 Should the rest of the real world be punished because somebody added a
 command in some documentation somewhere, which wasn't actually needed
 in the first place?

Don't get too fixated on the documentation.  The documentation just
gives some examples of how cvs2git can be used.

The reason that cvs2git outputs two files is that the first file is
emitted at the very beginning of the conversion and the second at the
very end.  These conversions can take a long time (> 1 day for very big
repos), can be interrupted and restarted between passes, and passes
can even be re-run with changed configurations.

CVS write access has to be turned off before the start of the final
conversion, so no VCS is possible until the conversion is over.  So
users are very interested in keeping the downtime minimal.  The blobfile
can also be unwieldy (its size is approximately the sum of the sizes of
all revisions of all files in the project).  Being able to load the
blobfile into one fast-import process and the dumpfile into a different
process (which relies on the feature that you propose removing) opens up
a lot of possibilities:

* The first fast-import of the blobfile can be started as soon as the
blobfile is complete and run in parallel with the rest of the conversion.

* If the blobfile needs to be transferred over the network (e.g.,
because Git will be served from a different server than the one doing
the conversion) the network transfer can also be done in parallel with
the rest of the conversion.

* The blobfile could be written to a named pipe that is being read by a
git-fast-import process, to avoid having to write the blobfile to disk
in the first place.

* The user could run "git repack" between loading the blobfile and
loading the dumpfile.

These are just the ways that cvs2git does and/or could benefit from the
flexibility that is now in git-fast-import.  Other tools might also be
using git-fast-import in ways that would be broken by your proposed change.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/


[PATCH 2/2] peel_ref: refactor for safety with simultaneous update

2013-05-06 Thread Jeff King
The peel_ref function lets a caller peel an object, taking
advantage of the fact that we may have the peeled value
cached for a packed ref.  However, this function is not
necessarily safe in the face of simultaneous ref updates.

The caller has typically already loaded the ref from disk
(e.g., as part of a for_each_ref enumeration), and therefore
knows what the ref's base sha1 is. But if the asked-for ref
is not the current ref, we will load the ref from disk
ourselves. If another process is simultaneously updating the
ref, we may not get the same sha1 that the caller thinks is
in the ref, and as a result may return a peeled value that
does not match the sha1 that the caller has.

To make this safer, we can have the caller pass in its idea
of the sha1 of the ref, and use that as the base for peeling
(and we can also double-check that it matches the
current_ref pointer when we use that).

This makes the function harder to call (it is not really
"peel this ref", but "peel this object, and by the way, it
is called by this refname"). However, none of the current
callers care, because they always call it from a
for_each_ref enumeration, meaning it is no extra work to
pass in the sha1.

Furthermore, that means that none of the current callers hit
the read_ref_full codepath at all (so this bug is not
currently triggerable). They either follow the current_ref
optimization, or if the ref is not packed, they go straight
to the deref_tag fallback.

Let's just rip out the unused code path entirely. Not only
is it not used now, but it would not even make sense to use:
you would get the peeled value of the ref, but you would
have no clue which base object led to that peel.

Signed-off-by: Jeff King p...@peff.net
---
So I think peel_ref is buggy as-is (in both master and in
mh/packed-refs-various), it's just not triggerable because of the
dead code path. And it would get buggier again with my other patch
series, as then get_packed_ref could actually re-read the packed-refs
file. But I really think peel_ref is a badly designed function.

You do not peel a ref; you peel an object. You might think it is
convenient to have a function do the ref lookup and the peeling
together, but in practice nobody wants that, because you would not know
what you had just peeled. The real point is to hit the current_ref
optimization. Michael's series already has factored out a peel_object
function with the relevant bits. I think we can simply do away with
peel_ref entirely, and move the current_ref optimization there. I think
it should be perfectly cromulent to do:

  int peel_object(const unsigned char *base, unsigned char *peeled)
  {
          if (current_ref && current_ref->flag & REF_KNOWS_PEELED &&
              !hashcmp(base, current_ref->u.value.sha1)) {
                  hashcpy(peeled, current_ref->u.value.peeled);
                  return 0;
          }

          /* peel as usual */
  }

That is, we do not need to care that it is _our_ ref that peels to that.
We happen to have a ref whose peeled value we know, and its object
matches the one we want to peel. Whether it is the same ref or not, the
same object sha1 should peel to the same peeled sha1. Of course, in
practice it will be our ref, because that is how the current callers
work. But it is also a safe function for callers who do not know or care
about the optimization.

My original patch on master is below for illustration, but do not look
too closely. I think the path I outlined above makes more sense, but I
am not going to work on it tonight.
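The "peel as usual" fallback is just deref_tag-style chain following. A toy illustration of that step, independent of which ref named the object (the struct object here is invented for the example; only the OBJ_* names echo Git's):

```c
#include <assert.h>
#include <stddef.h>

/* Minimal object model for illustration: a tag points at another
 * object, and peeling follows tags until a non-tag is reached. */
enum obj_type { OBJ_COMMIT, OBJ_TAG };

struct object {
	enum obj_type type;
	struct object *tagged; /* target, when type == OBJ_TAG */
};

/* Follow a (possibly nested) tag chain to the underlying object,
 * in the spirit of deref_tag_noverify(); NULL on a broken chain. */
static struct object *peel_object_chain(struct object *o)
{
	while (o && o->type == OBJ_TAG)
		o = o->tagged;
	return o;
}
```

This is why the same base sha1 always peels to the same peeled sha1: peeling depends only on the object, never on the refname.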

 builtin/describe.c |  2 +-
 builtin/pack-objects.c |  4 ++--
 builtin/show-ref.c |  2 +-
 refs.c | 24 
 refs.h |  3 ++-
 upload-pack.c  |  2 +-
 6 files changed, 11 insertions(+), 26 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 6636a68..23d7f1a 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -150,7 +150,7 @@ static int get_name(const char *path, const unsigned char *sha1, int flag, void
return 0;
 
/* Is it annotated? */
-   if (!peel_ref(path, peeled)) {
+   if (!peel_ref(path, sha1, peeled)) {
is_annotated = !!hashcmp(sha1, peeled);
} else {
hashcpy(peeled, sha1);
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f069462..76352df 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -556,7 +556,7 @@ static int mark_tagged(const char *path, const unsigned char *sha1, int flag,
 
	if (entry)
		entry->tagged = 1;
-   if (!peel_ref(path, peeled)) {
+   if (!peel_ref(path, sha1, peeled)) {
		entry = locate_object_entry(peeled);
		if (entry)
			entry->tagged = 1;
@@ -2032,7 +2032,7 @@ static int add_ref_tag(const char *path, const unsigned char *sha1, int flag, vo
unsigned char peeled[20];
 
	if (!prefixcmp(path, "refs/tags/") && /* is a 

Re: [PATCH 4/4] fast-import: only store commit objects

2013-05-06 Thread Michael Haggerty
On 05/06/2013 11:36 PM, Felipe Contreras wrote:
 This would simplify the documentation, and obliterate the need to use
 mark files at all:

As explained in my other email, this documentation change does not
remove all of the reasons that users might want to use mark files.  I
would still like to show users how they can load the files into Git as
two separate steps.

 diff -ur cvs2svn-2.4.0/www/cvs2git.html cvs2svn-2.4.0-mod/www/cvs2git.html
 --- cvs2svn-2.4.0/www/cvs2git.html	2012-09-22 01:49:55.0 -0500
 +++ cvs2svn-2.4.0-mod/www/cvs2git.html	2013-05-06 16:33:12.070189985 -0500
 @@ -355,14 +355,13 @@
    fast-import</tt>:</p>
 
  <pre>
 -git fast-import --export-marks=../cvs2svn-tmp/git-marks.dat &lt; ../cvs2svn-tmp/git-blob.dat
 -git fast-import --import-marks=../cvs2svn-tmp/git-marks.dat &lt; ../cvs2svn-tmp/git-dump.dat
 +cat ../cvs2svn-tmp/git-blob.dat ../cvs2svn-tmp/git-dump.dat | git fast-import
  </pre>
 
 -<p>On Linux/Unix this can be shortened to:</p>
 +<p>On Windows you should use type instead:</p>
 
  <pre>
 -cat ../cvs2svn-tmp/git-blob.dat ../cvs2svn-tmp/git-dump.dat | git fast-import
 +type ../cvs2svn-tmp/git-blob.dat ../cvs2svn-tmp/git-dump.dat | git fast-import
  </pre>
 
    </li>
 Only in cvs2svn-2.4.0-mod/www: .cvs2git.html.swp
 
 

Nevertheless, it *would* be nice to add the Windows equivalent of the
cat pipeline.  I knew about the "type" command but I was under the
impression that it is intended for text files and can corrupt binary
files.  Are you sure that using "type" as you suggest is binary-clean?
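Binary-cleanliness here means exactly byte-for-byte concatenation. What the cat pipeline guarantees, and what any type-based replacement would have to match, can be sketched as a plain append loop (append_file is an illustrative helper, not part of cvs2git or Git):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append src to dst byte-for-byte, as "cat a b > out" does. A
 * binary-clean replacement must preserve every byte, including
 * NULs and CR/LF sequences. Returns 0 on success, -1 on error. */
static int append_file(const char *dst, const char *src)
{
	FILE *in = fopen(src, "rb");
	FILE *out = fopen(dst, "ab");
	char buf[4096];
	size_t n;

	if (!in || !out) {
		if (in)
			fclose(in);
		if (out)
			fclose(out);
		return -1;
	}
	while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
		fwrite(buf, 1, n, out);
	fclose(in);
	fclose(out);
	return 0;
}
```

A text-mode copy, by contrast, may translate line endings or stop at a ^Z byte, either of which would corrupt a fast-import stream containing blob data.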

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/


Re: [PATCH 4/4] fast-import: only store commit objects

2013-05-06 Thread Michael Haggerty
On 05/06/2013 11:04 PM, Felipe Contreras wrote:
 On Mon, May 6, 2013 at 5:45 AM, Michael Haggerty mhag...@alum.mit.edu wrote:
 On 05/06/2013 12:32 PM, Thomas Rast wrote:
 Michael Haggerty mhag...@alum.mit.edu writes:

 On 05/03/2013 08:23 PM, Felipe Contreras wrote:
 On Fri, May 3, 2013 at 12:56 PM, Thomas Rast tr...@inf.ethz.ch wrote:
 Felipe Contreras felipe.contre...@gmail.com writes:

 How do we know that this doesn't break any users of fast-import?  Your
 comment isn't very reassuring:

 the vast majority of them will never be used again

 So what's with the minority?

 Actually I don't think there's any minority. If the client program
 doesn't store blobs, the blob marks are not used anyway. So there's no
 change.

 I haven't been following this conversation in detail, but your proposed
 change sounds like something that would break cvs2git [1].  Let me
 explain what cvs2git does and why:

 CVS stores all of the revisions of a single file in a single filename,v
 file in rcsfile(5) format.  The revisions are stored as deltas ordered
 so that a single revision can be reconstructed from a single serial read
 of the file.

 cvs2git reads each of these files once, reconstructing *all* of the
 revisions for a file in a single go.  It then pours them into a
 git-fast-import stream as blobs and sets a mark on each blob.

 Only much later in the conversion does it have enough information to
 reconstruct tree-wide commits.  At that time it outputs git-fast-import
 data (to a second file) defining the git commits and their ancestry.
 The contents are defined by referring to the marks of blobs from the
 first git-fast-import stream file.

 This strategy speeds up the conversion *enormously*.

 So if I understand correctly that you are proposing to stop allowing
 marks on blob objects to be set and/or referred to later, then I object
 vociferously.

 The proposed patch wants to stop writing marks (in --export-marks) for
 anything but commits.  Does cvs2git depend on that?  I.e., are you using
 two separate fast-import processes for the blob and tree/commit phases
 you describe above?

 Yes, it can be handy to start loading the first blobfile in parallel
 with the later stages of the conversion, before the second dumpfile is
 ready.  In that case the user needs to pass --export-marks to the first
 fast-import process to export marks on blobs so that the marks can be
 passed to the second fast-import via --import-marks.
 
 It can be used that way, but it doesn't have to be.

Please see my other email for an explanation of how the flexibility of
loading the two files separately can bring concrete benefits.

 So the proposed change would break a documented use of cvs2git.
 
 It's documented as an alternative. How many people actually use this
 form over the other? Is there really any advantage? It's possibly that
 basically nobody is using this form, and there's no benefits.

You conjectured earlier that nobody uses blob marks, and I provided a
counterexample.  Then you proposed a workaround that would require
changes to the cvs2git documentation, and I even explained how your
proposed workaround is not as flexible as the status quo.  Do you want
to go through the same argument with every possible user of git-fast-import?

It would be insanity to change the default behavior when a workaround on
the Git side (namely adding an option that tells git-fast-import *not*
to emit blob marks) would be quite straightforward to implement and have
little maintenance cost.

 Making the export of blob marks optional would of course be OK, as long
 as the default is to export them.
 
 Nobody benefits from leaving the default as it is.

Sure they do.  Any tool that *knows* that it doesn't need blob marks can
pass the new option and benefit.  Other tools benefit from not being
broken by your change.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/

