[PATCH] teach fast-export an --anonymize option

2014-08-21 Thread Jeff King
Sometimes users want to report a bug they experience on
their repository, but they are not at liberty to share the
contents of the repository. It would be useful if they could
produce a repository that has a similar shape to its history
and tree, but without leaking any information. This
anonymized repository could then be shared with developers
(assuming it still replicates the original problem).

This patch implements an --anonymize option to
fast-export, which generates a stream that can recreate such
a repository. Producing a single stream makes it easy for
the caller to verify that they are not leaking any useful
information. You can get an overview of what will be shared
by running a command like:

  git fast-export --anonymize --all |
  perl -pe 's/\d+/X/g' |
  sort -u |
  less

which will show every unique line we generate, modulo any
numbers (each anonymized token is assigned a number, like
User 0, and we replace it consistently in the output).
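The stream round-trips through fast-import, so a quick way to sanity-check the result before sharing is to rebuild a repository from it (a sketch; the scratch-repo setup and the `src`/`anon` path names are illustrative, not part of the patch):

```shell
# Stand-in repository with one secret commit (in real use, run the
# export inside the repository you actually want to share)
cd "$(mktemp -d)"
git init -q src
git -C src -c user.name=me -c user.email=me@example.com \
	commit -q --allow-empty -m 'secret message'

# Export the anonymized stream, then rebuild a shareable repo from it
git -C src fast-export --anonymize --all >anon.stream
git init -q anon
git -C anon fast-import --quiet <anon.stream
git -C anon log --all --oneline   # subjects and ids should be anonymized
```

The history shape survives the round trip, but none of the original messages or contents do.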

In addition to anonymizing, this produces test cases that
are relatively small (compared to the original repository)
and fast to generate (compared to using filter-branch, or
modifying the output of fast-export yourself). Here are
numbers for git.git:

  $ time git fast-export --anonymize --all \
	--tag-of-filtered-object=drop >output
  real    0m2.883s
  user    0m2.828s
  sys     0m0.052s

  $ gzip output
  $ ls -lh output.gz | awk '{print $5}'
  2.9M

Signed-off-by: Jeff King p...@peff.net
---
I haven't used this for anything real yet. It was a fun exercise, and I
do think it should work in practice. I'd be curious to hear a success
report of somebody actually debugging something with this.

In theory we could anonymize in a reversible way (e.g., by encrypting
each token with a key, and then not sharing the key), but it's a lot
more complicated and I don't think it buys us much. The one thing I'd
really like is to be able to test packing on an anonymized repository,
but two objects which delta well together will not have their encrypted
contents delta (unless you use something weak like ECB mode, in which
case the contents are not really as anonymized as you would hope).

I think most interesting cases involve things like commit traversal, and
that should still work here, even with made-up contents. Some weird
cases involving trees would not work if they depend on the filenames
(e.g., things that impact sort order). We could allow finer-grained
control, like --anonymize=commits,blobs if somebody was OK sharing
their filenames. I did not go that far here, but it should be pretty
easy to build on top.

 Documentation/git-fast-export.txt |   6 +
 builtin/fast-export.c | 280 --
 t/t9351-fast-export-anonymize.sh  | 117 
 3 files changed, 392 insertions(+), 11 deletions(-)
 create mode 100755 t/t9351-fast-export-anonymize.sh

diff --git a/Documentation/git-fast-export.txt 
b/Documentation/git-fast-export.txt
index 221506b..0ec7cad 100644
--- a/Documentation/git-fast-export.txt
+++ b/Documentation/git-fast-export.txt
@@ -105,6 +105,12 @@ marks the same across runs.
in the commit (as opposed to just listing the files which are
different from the commit's first parent).
 
+--anonymize::
+   Replace all paths, blob contents, commit and tag messages,
+   names, and email addresses in the output with anonymized data,
+   while still retaining the shape of history and of the stored
+   tree.
+
 --refspec::
Apply the specified refspec to each ref exported. Multiple of them can
be specified.
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 92b4624..acd2838 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -18,6 +18,7 @@
 #include "parse-options.h"
 #include "quote.h"
 #include "remote.h"
+#include "blob.h"
 
 static const char *fast_export_usage[] = {
	N_("git fast-export [rev-list-opts]"),
@@ -34,6 +35,7 @@ static int full_tree;
 static struct string_list extra_refs = STRING_LIST_INIT_NODUP;
 static struct refspec *refspecs;
 static int refspecs_nr;
+static int anonymize;
 
 static int parse_opt_signed_tag_mode(const struct option *opt,
 const char *arg, int unset)
@@ -81,6 +83,76 @@ static int has_unshown_parent(struct commit *commit)
return 0;
 }
 
+struct anonymized_entry {
+   struct hashmap_entry hash;
+   const char *orig;
+   size_t orig_len;
+   const char *anon;
+   size_t anon_len;
+};
+
+static int anonymized_entry_cmp(const void *va, const void *vb,
+   const void *data)
+{
+	const struct anonymized_entry *a = va, *b = vb;
+	return a->orig_len != b->orig_len ||
+		memcmp(a->orig, b->orig, a->orig_len);
+}
+
+/*
+ * Basically keep a cache of X->Y so that we can repeatedly replace
+ * the same anonymized string with another. The actual generation
+ * is farmed out to the generate function.
+ */
+static 

[PATCH v4] Allow the user to change the temporary file name for mergetool

2014-08-21 Thread Robin Rosenberg
Using the original filename suffix for the temporary input files to
the merge tool confuses IDEs like Eclipse. This patch introduces a
configuration option, mergetool.tmpsuffix, which gets appended to the
temporary file name. That way the user can choose to use a suffix
like .tmp, which does not cause confusion.

Signed-off-by: Robin Rosenberg robin.rosenb...@dewire.com
---
 Documentation/config.txt|  5 +
 Documentation/git-mergetool.txt |  7 +++
 git-mergetool.sh| 10 ++
 3 files changed, 18 insertions(+), 4 deletions(-)

Fixed a spelling error.

-- robin
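With the patch applied, the new knob would be set like any other config variable (a sketch; the `.tmp` value and the scratch repo are examples):

```shell
# Scratch repo; in real use, run this inside your own worktree
cd "$(mktemp -d)" && git init -q .

# Opt in to an extra suffix for mergetool's temporary files
git config mergetool.tmpsuffix .tmp
git config mergetool.tmpsuffix    # -> .tmp

# With the patch, the temporaries for a conflicted foo.java would then
# be named like ./foo.java.LOCAL.12345.java.tmp (pid + original suffix
# + tmpsuffix), so IDEs no longer treat them as real sources.
```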

diff --git a/Documentation/config.txt b/Documentation/config.txt
index c55c22a..0e15800 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1778,6 +1778,11 @@ notes.displayRef::
several times.  A warning will be issued for refs that do not
exist, but a glob that does not match any refs is silently
ignored.
+
+mergetool.tmpsuffix::
+	A string to append to the names of the temporary files mergetool
+	creates in the worktree as input to a custom merge tool. The
+	primary use is to avoid confusion in IDEs during a merge.
 +
 This setting can be overridden with the `GIT_NOTES_DISPLAY_REF`
 environment variable, which must be a colon separated list of refs or
diff --git a/Documentation/git-mergetool.txt b/Documentation/git-mergetool.txt
index e846c2e..80a0526 100644
--- a/Documentation/git-mergetool.txt
+++ b/Documentation/git-mergetool.txt
@@ -89,6 +89,13 @@ Setting the `mergetool.keepBackup` configuration variable to 
`false`
 causes `git mergetool` to automatically remove the backup as files
 are successfully merged.
 
+`git mergetool` may also create other temporary files for the
+different versions involved in the merge. By default these files have
+the same filename suffix as the file being merged. This may confuse
+other tools in use during a long merge operation. The user can set
+`mergetool.tmpsuffix` to be used as an extra suffix, which will be
+appended to the temporary filename to lessen that problem.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/git-mergetool.sh b/git-mergetool.sh
index 9a046b7..d7cc76c 100755
--- a/git-mergetool.sh
+++ b/git-mergetool.sh
@@ -214,6 +214,8 @@ checkout_staged_file () {
 }
 
 merge_file () {
+   tmpsuffix=$(git config mergetool.tmpsuffix || true)
+
MERGED=$1
 
	f=$(git ls-files -u -- "$MERGED")
@@ -229,10 +231,10 @@ merge_file () {
fi
 
	ext="$$$(expr "$MERGED" : '.*\(\.[^/]*\)$')"
-	BACKUP="./$MERGED.BACKUP.$ext"
-	LOCAL="./$MERGED.LOCAL.$ext"
-	REMOTE="./$MERGED.REMOTE.$ext"
-	BASE="./$MERGED.BASE.$ext"
+	BACKUP="./$MERGED.BACKUP.$ext$tmpsuffix"
+	LOCAL="./$MERGED.LOCAL.$ext$tmpsuffix"
+	REMOTE="./$MERGED.REMOTE.$ext$tmpsuffix"
+	BASE="./$MERGED.BASE.$ext$tmpsuffix"
 
	base_mode=$(git ls-files -u -- "$MERGED" | awk '{if ($3==1) print $1;}')
	local_mode=$(git ls-files -u -- "$MERGED" | awk '{if ($3==2) print $1;}')
-- 
2.1.0.rc2.6.g39c33ff.dirty

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2014, #03; Wed, 20)

2014-08-21 Thread Jeff King
On Wed, Aug 20, 2014 at 04:17:33PM -0700, Junio C Hamano wrote:

 * br/http-init-fix (2014-08-18) 2 commits
  - http: style fixes for curl_multi_init error check
  - http.c: die if curl_*_init fails
 
  Needs S-o-b from peff for the topmost one.

What you have queued looks good. Please add in:

  Signed-off-by: Jeff King p...@peff.net

 * jn/header-dependencies (2014-08-10) 1 commit
  - Update hard-coded header dependencies
 
  Needs further discussions on the list.

We could take Jonathan's patch in the meantime, which is a strict
improvement. I'll try to push the other part of the discussion forward.
If that reaches agreement quickly, we can do it instead of Jonathan's
patch rather than on top.

-Peff


Re: [PATCH] Update hard-coded header dependencies

2014-08-21 Thread Jeff King
On Sun, Aug 10, 2014 at 03:48:24PM -0400, Jeff King wrote:

 On Fri, Aug 08, 2014 at 02:58:26PM -0700, Jonathan Nieder wrote:
 
  Maybe it's worth switching to plain
  
  LIB_H += $(wildcard *.h)
  
  ?  People using ancient compilers that never change headers wouldn't
  be hurt, people using modern compilers that do change headers also
  wouldn't be hurt, and we could stop pretending to maintain an
  up-to-date list.
 [...]
 Maybe
 
   LIB_H += $(shell find . -name '*.h' -print)
 
 would work?

I took a stab at this and it seems to work. Here's a series.

  [1/2]: Makefile: use `find` to determine static header dependencies
  [2/2]: Makefile: drop CHECK_HEADER_DEPENDENCIES code

-Peff




[PATCH 1/2] Makefile: use find to determine static header dependencies

2014-08-21 Thread Jeff King
Most modern platforms will use automatically computed header
dependencies to figure out when a C file needs to be rebuilt
due to a header changing. With old compilers, however, we
fall back to a static list of header files. If any of them
changes, we recompile everything. This is overly
conservative, but the best we can do without compiler
support.

It is unfortunately easy for our static header list to grow
stale, as none of the regular developers make use of it.
Instead of trying to keep it up to date, let's invoke find
to generate the list dynamically.

We'd like to avoid running find at all when it is not
necessary, since it may add a non-trivial amount of time to
the build.  Make is _almost_ smart enough to avoid
evaluating the function when it is not necessary. For the
static header dependencies, we include $(LIB_H) as a
dependency only if COMPUTE_HEADER_DEPENDENCIES is turned
off, so we don't trigger its use there unless necessary. So
far so good.

However, we do always define $(LIB_H) as a dependency of
po/git.pot. Even though we do not actually try to build that
target, make will still evaluate the dependencies when
reading the Makefile, and expand the variable. This is not
ideal because almost nobody wants to build po/git.pot (only
the translation maintainer does it, and even then only once
or twice per release). We can hack around this by invoking a
sub-make which evaluates the variable only when po/git.pot
is actually being built.

Signed-off-by: Jeff King p...@peff.net
---
I also optimized the find a bit by pruning out some
directories that are almost certainly uninteresting. That
means we wouldn't catch an include of t/foo.h, but I think
that is probably an OK assumption.
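The pruned invocation looks roughly like this (the exact prune list is whatever the patch chose; this one is illustrative):

```shell
# Collect headers for LIB_H, skipping directories whose headers are
# never included by the C sources (.git, t/, Documentation/)
find . \( -name .git -o -name t -o -name Documentation \) -prune \
	-o -name '*.h' -print
```

In the Makefile this would sit inside `$(shell ...)`, which is why keeping the expansion lazy (see the po/git.pot note above) matters for build time.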

I'm open to attempts to improve my ugly git.pot hack. I
thought at first it was caused by the use of := in
assigning LOCALIZED_C, but after much testing, I think it is
actually the expansion of the dependencies.

 Makefile | 143 ++-
 1 file changed, 13 insertions(+), 130 deletions(-)

diff --git a/Makefile b/Makefile
index 2320de5..08dd973 100644
--- a/Makefile
+++ b/Makefile
@@ -432,7 +432,6 @@ XDIFF_OBJS =
 VCSSVN_OBJS =
 GENERATED_H =
 EXTRA_CPPFLAGS =
-LIB_H =
 LIB_OBJS =
 PROGRAM_OBJS =
 PROGRAMS =
@@ -631,131 +630,11 @@ VCSSVN_LIB = vcs-svn/lib.a
 
 GENERATED_H += common-cmds.h
 
-LIB_H += advice.h
-LIB_H += archive.h
-LIB_H += argv-array.h
-LIB_H += attr.h
-LIB_H += bisect.h
-LIB_H += blob.h
-LIB_H += branch.h
-LIB_H += builtin.h
-LIB_H += bulk-checkin.h
-LIB_H += bundle.h
-LIB_H += cache-tree.h
-LIB_H += cache.h
-LIB_H += color.h
-LIB_H += column.h
-LIB_H += commit.h
-LIB_H += compat/bswap.h
-LIB_H += compat/mingw.h
-LIB_H += compat/obstack.h
-LIB_H += compat/poll/poll.h
-LIB_H += compat/precompose_utf8.h
-LIB_H += compat/terminal.h
-LIB_H += compat/win32/dirent.h
-LIB_H += compat/win32/pthread.h
-LIB_H += compat/win32/syslog.h
-LIB_H += connected.h
-LIB_H += convert.h
-LIB_H += credential.h
-LIB_H += csum-file.h
-LIB_H += decorate.h
-LIB_H += delta.h
-LIB_H += diff.h
-LIB_H += diffcore.h
-LIB_H += dir.h
-LIB_H += exec_cmd.h
-LIB_H += ewah/ewok.h
-LIB_H += ewah/ewok_rlw.h
-LIB_H += fetch-pack.h
-LIB_H += fmt-merge-msg.h
-LIB_H += fsck.h
-LIB_H += gettext.h
-LIB_H += git-compat-util.h
-LIB_H += gpg-interface.h
-LIB_H += graph.h
-LIB_H += grep.h
-LIB_H += hashmap.h
-LIB_H += help.h
-LIB_H += http.h
-LIB_H += kwset.h
-LIB_H += levenshtein.h
-LIB_H += line-log.h
-LIB_H += line-range.h
-LIB_H += list-objects.h
-LIB_H += ll-merge.h
-LIB_H += log-tree.h
-LIB_H += mailmap.h
-LIB_H += merge-blobs.h
-LIB_H += merge-recursive.h
-LIB_H += mergesort.h
-LIB_H += notes-cache.h
-LIB_H += notes-merge.h
-LIB_H += notes-utils.h
-LIB_H += notes.h
-LIB_H += object.h
-LIB_H += pack-objects.h
-LIB_H += pack-revindex.h
-LIB_H += pack.h
-LIB_H += pack-bitmap.h
-LIB_H += parse-options.h
-LIB_H += patch-ids.h
-LIB_H += pathspec.h
-LIB_H += pkt-line.h
-LIB_H += prio-queue.h
-LIB_H += progress.h
-LIB_H += prompt.h
-LIB_H += quote.h
-LIB_H += reachable.h
-LIB_H += reflog-walk.h
-LIB_H += refs.h
-LIB_H += remote.h
-LIB_H += rerere.h
-LIB_H += resolve-undo.h
-LIB_H += revision.h
-LIB_H += run-command.h
-LIB_H += send-pack.h
-LIB_H += sequencer.h
-LIB_H += sha1-array.h
-LIB_H += sha1-lookup.h
-LIB_H += shortlog.h
-LIB_H += sideband.h
-LIB_H += sigchain.h
-LIB_H += strbuf.h
-LIB_H += streaming.h
-LIB_H += string-list.h
-LIB_H += submodule.h
-LIB_H += tag.h
-LIB_H += tar.h
-LIB_H += thread-utils.h
-LIB_H += transport.h
-LIB_H += tree-walk.h
-LIB_H += tree.h
-LIB_H += unpack-trees.h
-LIB_H += unicode_width.h
-LIB_H += url.h
-LIB_H += urlmatch.h
-LIB_H += userdiff.h
-LIB_H += utf8.h
-LIB_H += varint.h
-LIB_H += vcs-svn/fast_export.h
-LIB_H += vcs-svn/line_buffer.h
-LIB_H += vcs-svn/repo_tree.h
-LIB_H += vcs-svn/sliding_window.h
-LIB_H += vcs-svn/svndiff.h
-LIB_H += vcs-svn/svndump.h
-LIB_H += walker.h
-LIB_H += wildmatch.h
-LIB_H += wt-status.h
-LIB_H += xdiff-interface.h
-LIB_H += 

[PATCH 2/2] Makefile: drop CHECK_HEADER_DEPENDENCIES code

2014-08-21 Thread Jeff King
This code was useful when we kept a static list of header
files, and it was easy to forget to update it. Since the last
commit, we generate the list dynamically.

Technically this could still be used to find a dependency
that our dynamic check misses (e.g., a header file without a
".h" extension). But such a file is reasonably unlikely to be
added, and even less likely to be noticed by this tool
(because it has to be run manually). It is not worth
carrying around the cruft in the Makefile.

Signed-off-by: Jeff King p...@peff.net
---
I'm open to leaving this, as it's not hurting anything aside from
clutter, and it could possibly be used to cross-check the dynamic rule.
I just couldn't resist that all-deletion diffstat.

 Makefile | 59 ---
 1 file changed, 59 deletions(-)

diff --git a/Makefile b/Makefile
index 08dd973..65ff772 100644
--- a/Makefile
+++ b/Makefile
@@ -317,9 +317,6 @@ all::
 # dependency rules.  The default is auto, which means to use computed header
 # dependencies if your compiler is detected to support it.
 #
-# Define CHECK_HEADER_DEPENDENCIES to check for problems in the hard-coded
-# dependency rules.
-#
 # Define NATIVE_CRLF if your platform uses CRLF for line endings.
 #
 # Define XDL_FAST_HASH to use an alternative line-hashing method in
@@ -904,11 +901,6 @@ sysconfdir = etc
 endif
 endif
 
-ifdef CHECK_HEADER_DEPENDENCIES
-COMPUTE_HEADER_DEPENDENCIES = no
-USE_COMPUTED_HEADER_DEPENDENCIES =
-endif
-
 ifndef COMPUTE_HEADER_DEPENDENCIES
 COMPUTE_HEADER_DEPENDENCIES = auto
 endif
@@ -1809,29 +1801,13 @@ $(dep_dirs):
 missing_dep_dirs := $(filter-out $(wildcard $(dep_dirs)),$(dep_dirs))
 dep_file = $(dir $@).depend/$(notdir $@).d
 dep_args = -MF $(dep_file) -MQ $@ -MMD -MP
-ifdef CHECK_HEADER_DEPENDENCIES
-$(error cannot compute header dependencies outside a normal build. \
-Please unset CHECK_HEADER_DEPENDENCIES and try again)
-endif
 endif
 
 ifneq ($(COMPUTE_HEADER_DEPENDENCIES),yes)
-ifndef CHECK_HEADER_DEPENDENCIES
 dep_dirs =
 missing_dep_dirs =
 dep_args =
 endif
-endif
-
-ifdef CHECK_HEADER_DEPENDENCIES
-ifndef PRINT_HEADER_DEPENDENCIES
-missing_deps = $(filter-out $(notdir $^), \
-   $(notdir $(shell $(MAKE) -s $@ \
-   CHECK_HEADER_DEPENDENCIES=YesPlease \
-   USE_COMPUTED_HEADER_DEPENDENCIES=YesPlease \
-   PRINT_HEADER_DEPENDENCIES=YesPlease)))
-endif
-endif
 
 ASM_SRC := $(wildcard $(OBJECTS:o=S))
 ASM_OBJ := $(ASM_SRC:S=o)
@@ -1839,45 +1815,10 @@ C_OBJ := $(filter-out $(ASM_OBJ),$(OBJECTS))
 
 .SUFFIXES:
 
-ifdef PRINT_HEADER_DEPENDENCIES
-$(C_OBJ): %.o: %.c FORCE
-   echo $^
-$(ASM_OBJ): %.o: %.S FORCE
-   echo $^
-
-ifndef CHECK_HEADER_DEPENDENCIES
-$(error cannot print header dependencies during a normal build. \
-Please set CHECK_HEADER_DEPENDENCIES and try again)
-endif
-endif
-
-ifndef PRINT_HEADER_DEPENDENCIES
-ifdef CHECK_HEADER_DEPENDENCIES
-$(C_OBJ): %.o: %.c $(dep_files) FORCE
-	@set -e; echo CHECK $@; \
-	missing_deps="$(missing_deps)"; \
-	if test "$$missing_deps"; \
-	then \
-		echo missing dependencies: $$missing_deps; \
-		false; \
-	fi
-$(ASM_OBJ): %.o: %.S $(dep_files) FORCE
-	@set -e; echo CHECK $@; \
-	missing_deps="$(missing_deps)"; \
-	if test "$$missing_deps"; \
-	then \
-		echo missing dependencies: $$missing_deps; \
-		false; \
-	fi
-endif
-endif
-
-ifndef CHECK_HEADER_DEPENDENCIES
 $(C_OBJ): %.o: %.c GIT-CFLAGS $(missing_dep_dirs)
	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
 $(ASM_OBJ): %.o: %.S GIT-CFLAGS $(missing_dep_dirs)
	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
-endif
 
 %.s: %.c GIT-CFLAGS FORCE
	$(QUIET_CC)$(CC) -o $@ -S $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
-- 
2.1.0.346.ga0367b9


Bug in Git 1.9.4 (20140815) for Windows - cannot clone from SVN

2014-08-21 Thread Reiner Nothdurft
Hi all,
 
I tried to move a repository from SVN to Git, but all my tries - on three
different machines running Windows 7 with the latest patches - failed with the
same reason. I am running the latest version of Git for Windows
1.9.4-preview-20140815. One of my first steps was to clone the repository from
my server into my local file system, which led to the following reproducible
error:
 
Command: git svn clone svn:jnc
C:\Program Files (x86)\Git\bin\perl.exe: *** unable to remap C:\Program Files
(x86)\Git\bin\libneon-25.dll to same address as parent -- 0x85
      0 [main] perl.exe 1608 sync_with_child: child 10460(0x184) died before
initialization with status code 0x1
    748 [main] perl.exe 1608 sync_with_child: *** child state child loading
dlls C:\Program Files (x86)\Git\bin\perl.exe: *** unable to remap C:\Program
Files (x86)\Git\bin\libsvn_repos-1-0.dll to same address as parent -- 0x85
5066339 [main] perl.exe 1608 sync_with_child: child 13188(0x198) died before
initialization with status code 0x1
5067125 [main] perl.exe 1608 sync_with_child: *** child state child loading
dlls
 
Same issue when I add parameters for the local path, trunk, branches, etc.
Moving back to Git 1.9.2 for Windows finally fixed this issue. Meanwhile I heard
from a colleague that the issue also did not appear with 1.9.4 preview 20140611.

Hope this helps. Thanks for your great work!
Reiner Nothdurft
 


Re: Bug in Git 1.9.4 (20140815) for Windows - cannot clone from SVN

2014-08-21 Thread Thomas Braun
Am 21.08.2014 um 11:53 schrieb Reiner Nothdurft:
 Hi all,
  
 I tried to move a repository from SVN to Git, but all my tries - on three
 different machines running Windows 7 with the latest patches - failed with the
 same reason. I am running the latest version of Git for Windows
 1.9.4-preview-20140815. One of my first steps was to clone the repository from
 my server into my local file system, which led to the following reproducible
 error:
  
 Command: git svn clone svn:jnc
 C:\Program Files (x86)\Git\bin\perl.exe: *** unable to remap C:\Program Files
 (x86)\Git\bin\libneon-25.dll to same addre ss as parent -- 0x85
   0 [main] perl.exe 1608 sync_with_child: child 10460(0x184) died before
 initialization with status code 0x1
 748 [main] perl.exe 1608 sync_with_child: *** child state child loading
 dlls C:\Program Files (x86)\Git\bin\perl.exe: *** unable to remap C:\Program
 Files (x86)\Git\bin\libsvn_repos-1-0.dll to same  address as parent -- 
 0x85
 5066339 [main] perl.exe 1608 sync_with_child: child 13188(0x198) died before
 initialization with status code 0x1
 5067125 [main] perl.exe 1608 sync_with_child: *** child state child loading
 dlls
  
 Same issue when I add parameters for the local path, trunk, branches, etc.
 Moving back to Git 1.9.2 for Windows fixed this issue finally. Meanwhile I 
 heard
 from a collegue, that the issue also did not appear with 1.9.4 preview 
 20140611.

This was mentioned in the release notes, a fix is outlined in
https://github.com/msysgit/msysgit/pull/245.



[BUG] resolved deltas

2014-08-21 Thread Petr Stodulka

Hi guys,
I wanted to post a patch for this bug here, but I can't find the
primary source of the problem [0], because I don't understand some
ideas in the code. Here is what I have investigated so far:


The bug is reproducible from git version 1.8.3.1 (maybe earlier
1.8.x, but I didn't test it) up to the current upstream version.
The problem doesn't exist in version 1.7.x - or more precisely, it is
not reproducible there. It may have become reproducible
with commit 7218a215 - that commit added an assert in
builtin/index-pack.c (currently line 918) - but I didn't test this.


This assert checks that the object's (the child's) real type ==
OBJ_REF_DELTA. It fails for an object with real_type == OBJ_TREE (set
from the parent's type) and type == OBJ_REF_DELTA. Here are some
prints of the important variables just before the failing assert()
(from an older version, but I think the values still apply in this
case):

--
(gdb) p base->ref_first
$9 = 3223

(gdb) p deltas[3223]
$10 = {
  base = {
sha1 = "\274\070k\343K\324x\037q\273h\327*n\n\356\061$ \036",
offset = 2267795834784135356
  },
  obj_no = 11152
}

(gdb) p *child
$11 = {
  idx = {
sha1 = "J\242i\251\261\273\305\067\236%CE\022\257\252\342[;\tD",
crc32 = 2811659028,
offset = 10392153
  },
  size = 30,
  hdr_size = 22,
  type = OBJ_REF_DELTA,
  real_type = OBJ_TREE,
  delta_depth = 0,
  base_object_no = 5458
}

(gdb) p objects[5458]
$13 = {
  idx = {
sha1 = "\274\070k\343K\324x\037q\273h\327*n\n\356\061$ \036",
crc32 = 3724458534,
offset = 6879168
  },
  size = 236,
  hdr_size = 2,
  type = OBJ_TREE,
  real_type = OBJ_TREE,
  delta_depth = 0,
  base_object_no = 0
}
---

base_object_no is always 5458; base->ref_first and the child objects
are dynamic. If you want to stop the process at the same position, my
recommendation for gdb (if you use gdb), in file index-pack.c:

br 1093
cont
set variable nr_threads = 1
br 
cond 2 i == 6300
cont
br 916
cont
---
compiled without any optimisation; line numbers modified for commit
6c4ab27f2378ce67940b4496365043119d72
The condition i == 6300 --- the last iteration before the failure has
a dynamic rank in the range 6304 to 6309 in most cases (which is weird
to me, since the count of downloaded objects is always 12806; maybe a
wrong search for children?)

---

Here I am lost. I don't really know what to do next, because I don't
understand some of the ideas in the code, e.g. the search for children
in the functions find_delta() and find_delta_children(). The
calculation on line 618:


int next = (first+last) / 2;

I still don't understand it. I didn't find a description of this
search algorithm in the technical documentation, but I haven't read
all of it yet. However, I think the source of the problem could be
somewhere in these two functions. When a child is found, its real_type
is set to the parent's type in resolve_delta() on line 865, and then
it is only a matter of time until the failure. I don't think the
problem is in the repository itself [1], but it is possible.


Any further ideas/hints, or an explanation of these functions? I began
studying git's source code and mechanisms this week, so please don't
beat me yet :-)


Regards,
Petr

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1099919
[1] git clone https://code.google.com/p/mapsforge/ mapsforge.git


[PATCH] send-pack: take refspecs over stdin

2014-08-21 Thread Jeff King
Pushing a large number of refs works over most transports,
because we implement send-pack as an internal function.
However, it can sometimes fail when pushing over http,
because we have to spawn git send-pack --stateless-rpc to
do the heavy lifting, and we pass each refspec on the
command line. This can cause us to overflow the OS limits on
the size of the command line for a large push.

We can solve this by giving send-pack a --stdin option and
using it from remote-curl.  We already dealt with this on
the fetch-pack side in 078b895 (fetch-pack: new --stdin
option to read refs from stdin, 2012-04-02). The stdin
option (and in particular, its use of packet-lines for
stateless-rpc input) is modeled after that solution.

Signed-off-by: Jeff King p...@peff.net
---
I had to fiddle with the numbers in the http test. Linux gives up to 1/4
of the configured stack ulimit as space for the cmdline, so I had to
pick a number big enough so that we had stack to run the actual
operation but small enough to limit the cmdline. The test as it is there
fails for me without this patch and succeeds with it. I suspect the
numbers are quite different on other systems, but I think it should at
least succeed everywhere with this patch. I'd also be fine with cutting
that test if it proves too flaky.
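The limits involved can be inspected directly (Linux-specific behaviour; the numbers vary per system):

```shell
# Stack rlimit, in KiB; Linux allows roughly 1/4 of this for
# the argv + environment of an exec'd process
ulimit -s

# The resulting per-exec limit on argument bytes
getconf ARG_MAX
```

This is why the test has to pick its ref count relative to the configured stack ulimit rather than a fixed number.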

I tried originally bumping it to 50,000 tags to match the fetch-pack
test. But besides needing to protect it behind an EXPENSIVE prereq
(which means basically nobody is ever going to run it), it also seems to
trigger a nasty quadratic behavior in send-pack (6400 refs takes ~16s,
with the time quadrupling for each doubling of refs; the same operation
over a pipe takes 140ms).  Oddly, the behavior doesn't seem to trigger
when pushing over a local pipe, so it's presumably related to
stateless-rpc. It looked like we were deep in match_refs, but I haven't
figured it out beyond that.
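For reference, the pkt-line framing used for the stateless-rpc stdin can be produced from the shell (a hypothetical `pkt` helper illustrating the format, not the patch's C code):

```shell
# One pkt-line per refspec: 4 hex digits giving the total length
# (including the 4 header bytes themselves), then the payload.
# "0000" is the flush packet that terminates the list.
pkt() { printf '%04x%s' $((4 + ${#1})) "$1"; }

pkt "refs/heads/master"                    # -> 0015refs/heads/master
pkt "refs/heads/topic:refs/heads/topic"
printf '0000'
```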

 Documentation/git-send-pack.txt | 13 -
 builtin/send-pack.c | 27 +++
 remote-curl.c   |  8 +++-
 t/t5541-http-push-smart.sh  | 15 +++
 4 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-send-pack.txt b/Documentation/git-send-pack.txt
index dc3a568..2a0de42 100644
--- a/Documentation/git-send-pack.txt
+++ b/Documentation/git-send-pack.txt
@@ -35,6 +35,16 @@ OPTIONS
Instead of explicitly specifying which refs to update,
update all heads that locally exist.
 
+--stdin::
+   Take the list of refs from stdin, one per line. If there
+   are refs specified on the command line in addition to this
+   option, then the refs from stdin are processed after those
+   on the command line.
++
+If '--stateless-rpc' is specified together with this option then
+the list of refs must be in packet format (pkt-line). Each ref must
+be in a separate packet, and the list must end with a flush packet.
+
 --dry-run::
Do everything except actually send the updates.
 
@@ -77,7 +87,8 @@ this flag.
 Without '--all' and without any 'ref', the heads that exist
 both on the local side and on the remote side are updated.
 
-When one or more 'ref' are specified explicitly, it can be either a
+When one or more 'ref' are specified explicitly (whether on the
+command line or via `--stdin`), it can be either a
 single pattern, or a pair of such pattern separated by a colon
 : (this means that a ref name cannot have a colon in it).  A
 single pattern 'name' is just a shorthand for 'name:name'.
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index f420b74..4b1bc0f 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -110,6 +110,7 @@ int cmd_send_pack(int argc, const char **argv, const char 
*prefix)
int flags;
unsigned int reject_reasons;
int progress = -1;
+   int from_stdin = 0;
struct push_cas_option cas = {0};
 
argv++;
@@ -169,6 +170,10 @@ int cmd_send_pack(int argc, const char **argv, const char 
*prefix)
args.stateless_rpc = 1;
continue;
}
+		if (!strcmp(arg, "--stdin")) {
+			from_stdin = 1;
+			continue;
+		}
 		if (!strcmp(arg, "--helper-status")) {
 			helper_status = 1;
 			continue;
@@ -201,6 +206,28 @@ int cmd_send_pack(int argc, const char **argv, const char 
*prefix)
}
if (!dest)
usage(send_pack_usage);
+
+   if (from_stdin) {
+   struct argv_array all_refspecs = ARGV_ARRAY_INIT;
+
+		for (i = 0; i < nr_refspecs; i++)
+			argv_array_push(&all_refspecs, refspecs[i]);
+
+   if (args.stateless_rpc) {
+   const char *buf;
+   while ((buf = packet_read_line(0, NULL)))
+ 

[PATCH v2] send-pack: take refspecs over stdin

2014-08-21 Thread Jeff King
On Thu, Aug 21, 2014 at 08:17:10AM -0400, Jeff King wrote:

  Documentation/git-send-pack.txt | 13 -
  builtin/send-pack.c | 27 +++
  remote-curl.c   |  8 +++-
  t/t5541-http-push-smart.sh  | 15 +++
  4 files changed, 61 insertions(+), 2 deletions(-)

Whoops. Forgot to actually add the battery of individual send-pack
tests. Here's a re-send.

-- >8 --
Subject: send-pack: take refspecs over stdin

Pushing a large number of refs works over most transports,
because we implement send-pack as an internal function.
However, it can sometimes fail when pushing over http,
because we have to spawn git send-pack --stateless-rpc to
do the heavy lifting, and we pass each refspec on the
command line. This can cause us to overflow the OS limits on
the size of the command line for a large push.

We can solve this by giving send-pack a --stdin option and
using it from remote-curl.  We already dealt with this on
the fetch-pack side in 078b895 (fetch-pack: new --stdin
option to read refs from stdin, 2012-04-02). The stdin
option (and in particular, its use of packet-lines for
stateless-rpc input) is modeled after that solution.

Signed-off-by: Jeff King p...@peff.net
---
 Documentation/git-send-pack.txt | 13 +-
 builtin/send-pack.c | 27 
 remote-curl.c   |  8 +++-
 t/t5408-send-pack-stdin.sh  | 92 +
 t/t5541-http-push-smart.sh  | 15 +++
 5 files changed, 153 insertions(+), 2 deletions(-)
 create mode 100755 t/t5408-send-pack-stdin.sh
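The overflow scenario the commit message describes can be made concrete with a quick back-of-the-envelope check (the numbers below are illustrative, not from the patch): tens of thousands of refspecs add up to megabytes, while OS argv limits are typically in the kilobyte-to-few-megabyte range, and a pipe to --stdin has no such cap.

```shell
# Rough size of 50,000 refspecs if passed as command-line arguments.
nrefs=50000
bytes=$(seq 1 "$nrefs" | sed 's|^|refs/heads/topic-|' | wc -c)
echo "bytes of refspecs: $bytes"
```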

diff --git a/Documentation/git-send-pack.txt b/Documentation/git-send-pack.txt
index dc3a568..2a0de42 100644
--- a/Documentation/git-send-pack.txt
+++ b/Documentation/git-send-pack.txt
@@ -35,6 +35,16 @@ OPTIONS
Instead of explicitly specifying which refs to update,
update all heads that locally exist.
 
+--stdin::
+   Take the list of refs from stdin, one per line. If there
+   are refs specified on the command line in addition to this
+   option, then the refs from stdin are processed after those
+   on the command line.
++
+If '--stateless-rpc' is specified together with this option then
+the list of refs must be in packet format (pkt-line). Each ref must
+be in a separate packet, and the list must end with a flush packet.
+
 --dry-run::
Do everything except actually send the updates.
 
@@ -77,7 +87,8 @@ this flag.
 Without '--all' and without any 'ref', the heads that exist
 both on the local side and on the remote side are updated.
 
-When one or more 'ref' are specified explicitly, it can be either a
+When one or more 'ref' are specified explicitly (whether on the
+command line or via `--stdin`), it can be either a
single pattern, or a pair of such pattern separated by a colon
":" (this means that a ref name cannot have a colon in it).  A
 single pattern 'name' is just a shorthand for 'name:name'.
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index f420b74..4b1bc0f 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -110,6 +110,7 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
int flags;
unsigned int reject_reasons;
int progress = -1;
+   int from_stdin = 0;
struct push_cas_option cas = {0};
 
argv++;
@@ -169,6 +170,10 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
args.stateless_rpc = 1;
continue;
}
+   if (!strcmp(arg, "--stdin")) {
+   from_stdin = 1;
+   continue;
+   }
if (!strcmp(arg, "--helper-status")) {
helper_status = 1;
continue;
@@ -201,6 +206,28 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
}
if (!dest)
usage(send_pack_usage);
+
+   if (from_stdin) {
+   struct argv_array all_refspecs = ARGV_ARRAY_INIT;
+
+   for (i = 0; i < nr_refspecs; i++)
+   argv_array_push(&all_refspecs, refspecs[i]);
+
+   if (args.stateless_rpc) {
+   const char *buf;
+   while ((buf = packet_read_line(0, NULL)))
+   argv_array_push(&all_refspecs, buf);
+   } else {
+   struct strbuf line = STRBUF_INIT;
+   while (strbuf_getline(&line, stdin, '\n') != EOF)
+   argv_array_push(&all_refspecs, line.buf);
+   strbuf_release(&line);
+   }
+
+   refspecs = all_refspecs.argv;
+   nr_refspecs = all_refspecs.argc;
+   }
+
/*
 * --all and --mirror are incompatible; neither makes sense
 * 
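The pkt-line framing the --stdin documentation above requires for --stateless-rpc can be sketched in a few lines of shell. This is a simplified model: each packet carries a 4-hex-digit length prefix that counts itself, and a '0000' flush packet ends the list (real packets may additionally carry a trailing newline inside the payload).

```shell
# Emit one pkt-line per ref, then a flush packet.
pkt() { printf '%04x%s' $((${#1} + 4)) "$1"; }
stream=$(pkt 'refs/heads/master'; pkt 'refs/heads/topic'; printf '0000')
echo "$stream"
```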

Re: Re: Re: Relative submodule URLs

2014-08-21 Thread Heiko Voigt
On Wed, Aug 20, 2014 at 08:18:12AM -0500, Robert Dailey wrote:
 On Tue, Aug 19, 2014 at 3:57 PM, Heiko Voigt hvo...@hvoigt.net wrote:
  I would actually error out when specified in already cloned state.
  Because otherwise the user might expect the remote to be updated.
 
  Since we are currently busy implementing recursive fetch and checkout I have
  added that to our ideas list[1] so we do not forget about it.
 
  In the meantime you can either use the branch.<name>.remote
  configuration to define a remote to use or just use 'origin'.
 
  Cheers Heiko
 
  [1] 
  https://github.com/jlehmann/git-submod-enhancements/wiki#add-with-remote--switch-to-submodule-update
 
 Thanks Heiko.
 
 I would offer to help implement this for you, if you find it to be a
 good idea, but I've never done git development before and based on
 what I've seen it seems like you need to know at least 2-3 languages
 to contribute: bash, perl, C++. I know C++ & Python but I don't know
 perl or bash scripting language.
 
 What would it take to help you guys out? It's easy to complain & file
 bugs but as a developer I feel like I should offer more, if it suits
 you.

For this particular case shell scripting should be sufficient. And it
should not take too much time. Have a look at the git-submodule.sh
script in the repository. That is the one implementing the git submodule
command.

Additionally you need to extend the documentation and write a test or
two. Writing a test is also done in shell script. The documentation[1] is
in asciidoc which is pretty self explanatory.

The test should probably go into t/t7406-submodule-update.sh and, as
Phil pointed out, t/t7403-submodule-sync.sh.
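For reference, the calling convention such a test would follow looks roughly like this. This is a self-contained stub: real tests source ./test-lib.sh, which provides test_expect_success; the test name and body here are made up.

```shell
# Stub mimicking test-lib.sh's test_expect_success calling convention.
test_expect_success () {
	if sh -ec "$2"; then echo "ok - $1"; else echo "not ok - $1"; fi
}
test_expect_success 'submodule update: example assertion' '
	test 1 = 1
'
```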

Also make sure to read the shell scripting part in
Documentation/CodingGuidelines and as a general rule: Keep close to the
style you find in the file. And when you are ready to send a patch:
Documentation/SubmittingPatches.

If you are happy but unsure about anything just send a patch with your
implementation (CC me and everyone involved) and we will discuss it here
on the list.

Cheers Heiko

[1] Documentation/git-submodule.txt
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2014, #03; Wed, 20)

2014-08-21 Thread Heiko Voigt
On Wed, Aug 20, 2014 at 04:17:33PM -0700, Junio C Hamano wrote:
 * hv/submodule-config (2014-06-30) 4 commits
   (merged to 'next' on 2014-07-17 at 5e0ce45)
  + do not die on error of parsing fetchrecursesubmodules option
  + use new config API for worktree configurations of submodules
  + extract functions for submodule config set and lookup
  + implement submodule config cache for lookup of submodule names
 
  Will cook in 'next'.

While using the config API for implementing my recursive fetch, I
discovered a bug in my API here. In submodule_from_name() the lookup of
the gitmodule sha1 is missing. So currently you would have to pass in
the gitmodule sha1 instead of the commit sha1 as documented. I will
extend the test and fix this.

Cheers Heiko


Re: [PATCH 0/4] Handling unmerged files with merged entries

2014-08-21 Thread Jaime Soriano Pastor
Good points.

On Thu, Aug 21, 2014 at 12:19 AM, Junio C Hamano gits...@pobox.com wrote:
 After looking at what you did in 1/4, I started to wonder if we can
 solve this in add_index_entry_with_check() in a less intrusive way.
 When we call the function with a stage #0 entry, we are telling the
 index that any entry in higher stage for the same path must
 disappear.  Since the current implementation of the function assumes
 that the index is not corrupt in this particular way to have both
 merged and unmerged entries for the same path, it fails to remove
 the higher stage entries.  If we fix the function, wouldn't it make
 your 1/4 unnecessary?  Read-only operations such as ls-files -s
 would not call add_index_entry() so diagnostic tools would not be
 affected even with such a fix.

Another thing that is done in 1/4 is to get rid of the call to
index_name_pos, which can lead to infinite loops depending on what the
previous add_index_entry call does, as we have seen, and I wonder why
it is really needed, especially if we guarantee the order in the index.

 ... which may look something like the one attached at the end.

And it would be more in the line of my first patch.

 But then it made me wonder even more.

 There are other ways a piece of software can leave a corrupt index
 for us to read from.  Your fix, or the simpler one I suggested for
 that matter, would still assume that the index entries are in the
 sorted order, and a corrupt index that does not sort its entries
 correctly will cause us to behave in an undefined way.  At some
 point we should draw a line and say "Your index is hopelessly
 corrupt.", send it back to whatever broken software that originally
 wrote such a mess and have the user use that software to fix the
 corrupt index up before talking to us.

True.

 For that, we need to catch an index whose entries are not sorted and
 error out, perhaps when read_index_from() iterates over the mmapped
 index entries.  We can even draw that "hopelessly corrupt" line
 above the breakage you are addressing and add a check to make sure
 no path has both merged and unmerged entries to the same check to
 make it error out.

 I suspect that such a "detect and error out" may be sufficient
 also may be more robust than the approach that assumes that a
 breakage is only to have both merged and unmerged entries for the
 same path, the entries are still correctly sorted.

Agree. I have prepared an initial patch for this to discuss, but
adding checks in read_index_from() can add a small(?) penalty to
all git operations, especially with big indexes.
And it probably wouldn't allow the user to fix the repository using
git commands (unless we only warn instead of dying, depending on the
thing that is broken).


[PATCH] Check order when reading index

2014-08-21 Thread Jaime Soriano Pastor
Signed-off-by: Jaime Soriano Pastor jsorianopas...@gmail.com
---
 read-cache.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 7f5645e..e117d3a 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1438,6 +1438,21 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
return ce;
 }
 
+void check_next_ce(struct cache_entry *ce, struct cache_entry *next_ce) {
+   if (!ce || !next_ce)
+   return;
+   if (cache_name_compare(ce->name, ce_namelen(ce),
+  next_ce->name, ce_namelen(next_ce)) > 0)
+   die("Unordered stage entries in index");
+   if (ce_same_name(ce, next_ce)) {
+   if (!ce_stage(ce))
+   die("Multiple stage entries for merged file '%s'",
+   ce->name);
+   if (ce_stage(ce) >= ce_stage(next_ce))
+   die("Unordered stage entries for '%s'", ce->name);
+   }
+}
+
 /* remember to discard_cache() before reading a different cache! */
 int read_index_from(struct index_state *istate, const char *path)
 {
@@ -1499,6 +1514,9 @@ int read_index_from(struct index_state *istate, const char *path)
ce = create_from_disk(disk_ce, consumed, previous_name);
set_index_entry(istate, i, ce);
 
+   if (i > 0)
+   check_next_ce(istate->cache[i-1], ce);
+
src_offset += consumed;
}
strbuf_release(&previous_name_buf);
-- 
2.0.4.dirty
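The ordering invariant the patch enforces can be sketched with sort(1): index entries must be ordered by name, then by stage for equal names. 'sort -c' exits 0 only when its input already satisfies the given order, which is the same adjacent-pair check the patch performs in C.

```shell
# Three (name, stage) pairs in valid index order pass the check.
out=$(printf '%s\n' 'conflict 1' 'conflict 2' 'conflict2 0' |
	sort -c -k1,1 -k2,2n && echo 'ordered')
echo "$out"
```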



Re: [PATCH] Check order when reading index

2014-08-21 Thread Jaime Soriano Pastor
On Thu, Aug 21, 2014 at 3:43 PM, Jaime Soriano Pastor
jsorianopas...@gmail.com wrote:
 +   if (!ce_stage(ce))
 +   die("Multiple stage entries for merged file '%s'",
 +   ce->name);

This case can be provoked by "git update-index --index-info" as shown
in the patch with the added test, so maybe it should be only a warning.
We could also add some variation of the patches in this thread to make
the same command able to fix the situation.


Re: [PATCH 3/4] Added tests for the case of merged and unmerged entries for the same file

2014-08-21 Thread Jaime Soriano Pastor
On Wed, Aug 20, 2014 at 11:00 PM, Junio C Hamano gits...@pobox.com wrote:
 Jaime Soriano Pastor jsorianopas...@gmail.com writes:

 Signed-off-by: Jaime Soriano Pastor jsorianopas...@gmail.com
 ---
  t/t9904-unmerged-file-with-merged-entry.sh | 86 ++

 Isn't this number already used for another test?  A test on the
 index probably belongs to t2XXX or t3XXX family.

Umm, I thought this test number was free, I just added it at the last+1
position. If I finally add a test I'll take this into account. Thanks.

  1 file changed, 86 insertions(+)
  create mode 100755 t/t9904-unmerged-file-with-merged-entry.sh

 diff --git a/t/t9904-unmerged-file-with-merged-entry.sh 
 b/t/t9904-unmerged-file-with-merged-entry.sh
 new file mode 100755
 index 000..945bc1c
 --- /dev/null
 +++ b/t/t9904-unmerged-file-with-merged-entry.sh
 @@ -0,0 +1,86 @@
 +#!/bin/sh
 +
 +test_description='Operations with unmerged files with merged entries'
 +
 +. ./test-lib.sh
 +
 +setup_repository() {
 + test_commit A conflict A
 + test_commit A conflict2 A2 branchbase
 + test_commit B conflict B
 + test_commit B conflict2 B2
 + git checkout branchbase -b branch1
 + test_commit C conflict C
 + test_commit C conflict2 C2
 + test_commit something otherfile otherfile
 +}

 No error is checked here?

This is only a helper function for setup, not a test itself.

 +setup_stage_state() {
 + git checkout -f HEAD
 + {
 + git ls-files -s conflict conflict2
  + git merge master > /dev/null
 + git ls-files -s conflict conflict2
  + } > index

 No error is checked here?

Same here.

 Style: no SP between redirection operator and its target, i.e.

 git merge master >/dev/null
 { ... } >index

 + cat index | git update-index --index-info

 Do not cat a single file into a pipeline, i.e.

 git update-index --index-info <index

True :) Thanks.


Re: [PATCH 4/4] git update-index --cacheinfo can be used to select a stage when there are merged and unmerged entries

2014-08-21 Thread Jaime Soriano Pastor
On Wed, Aug 20, 2014 at 11:08 PM, Junio C Hamano gits...@pobox.com wrote:
 Jaime Soriano Pastor jsorianopas...@gmail.com writes:

 Subject: Re: [PATCH 4/4] git update-index --cacheinfo can be used to select
  a stage when there are merged and unmerged entries

 Hmph, what does it even mean?  Shared with your [1/4] is that it is
 unclear if you are stating an existing problem to be fixed or
 describing the desired end result.

 Also update-index --cacheinfo is not about selecting but is
 about stuffing an entry to the index, so "can be used to select"
 is doubly puzzling...

Well, somehow I understand "update-index --cacheinfo" as a low-level
version of "add". I was trying to explain the desired end result, yes.

   ...
 +test_expect_success 'git update-index --cacheinfo to select a stage to use' '
 + setup_stage_state &&
 + git cat-file blob :1:conflict > conflict &&

 Style: no SP between redirection and its target.

Ok.

 + git update-index --cacheinfo 100644,`git hash-object conflict`,conflict

 Style: we prefer $() over ``

Ok.

 + git ls-files -s conflict > output &&
 + test_line_count = 1 output

 Is "we have only one line" the only thing we care about?  Don't we
 want to check which stage the entry is at?

Yes, it'd be better.

Thanks.


Re: [PATCH] Check order when reading index

2014-08-21 Thread Duy Nguyen
On Thu, Aug 21, 2014 at 8:43 PM, Jaime Soriano Pastor
jsorianopas...@gmail.com wrote:
 @@ -1499,6 +1514,9 @@ int read_index_from(struct index_state *istate, const char *path)
 ce = create_from_disk(disk_ce, consumed, previous_name);
 set_index_entry(istate, i, ce);

 +   if (i > 0)
 +   check_next_ce(istate->cache[i-1], ce);
 +
 src_offset += consumed;
 }
 strbuf_release(previous_name_buf);

It may be nice to save the good index stamp as an index extension so
we don't have to check this over and over. I'm thinking about big
indexes where compare cost might matter (I'm not so sure yet, will do
some testing when I have time).
-- 
Duy


Re: [PATCH v13 11/11] Documentation: add documentation for 'git interpret-trailers'

2014-08-21 Thread Marc Branchaud
On 14-08-20 11:39 PM, Christian Couder wrote:
 On Thu, Aug 21, 2014 at 12:05 AM, Marc Branchaud marcn...@xiplink.com wrote:
 On 14-08-16 12:06 PM, Christian Couder wrote:
 +
 +* after them it's only possible to have some lines that contain only
 +  spaces, and then a patch; the patch part is recognized using the
 +  fact that its first line starts with '---' (three minus signs),

  Is that "starts with" or "consists solely of"?
 
  It is "starts with". (The starts_with() function is used.)

Wouldn't it be more robust to do it the other way?  I can imagine cases when
a human might want to start a line of text with "---", whereas we can make
sure that git tools always use a plain "---" line with no extra text.

Not a big deal either way though.  Thanks for working on this!

M.
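The difference between the two rules is easy to see on a line that merely begins with the delimiter: a "starts with ---" rule matches it, while an exact-match rule does not.

```shell
# Compare prefix-match vs exact-match detection of the patch delimiter.
line='--- 8< ---'
case "$line" in
---*) starts=yes ;;
*) starts=no ;;
esac
[ "$line" = '---' ] && exact=yes || exact=no
echo "starts-with: $starts, exact: $exact"
```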



Re: [PATCH 1/2] Makefile: use find to determine static header dependencies

2014-08-21 Thread Jonathan Nieder
Hi,

Jeff King wrote:

 However, we do always define $(LIB_H) as a dependency of
 po/git.pot. Even though we do not actually try to build that
 target, make will still evaluate the dependencies when
 reading the Makefile, and expand the variable. This is not
 ideal

Would the following work?  The current dependencies for po/git.pot are
not correct anyway --- they include LOCALIZED_C but not LOCALIZED_SH
or LOCALIZED_PERL, so someone hacking on shell scripts and then trying
'make po/git.pot' could end up with the pot file not being
regenerated.

-- >8 --
Subject: i18n: treat make pot as an explicitly-invoked target

po/git.pot is normally used as-is and not regenerated by people
building git, so it is okay if an explicit make po/git.pot always
automatically regenerates it.  Depend on the magic FORCE target
instead of explicitly keeping track of dependencies.

This simplifies the makefile, in particular preparing for a moment
when $(LIB_H), which is part of $(LOCALIZED_C), can be computed on the
fly.

We still need a dependency on GENERATED_H, to force those files to be
built when regenerating git.pot.

Signed-off-by: Jonathan Nieder jrnie...@gmail.com
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 2320de5..cf0ccdf 100644
--- a/Makefile
+++ b/Makefile
@@ -2138,7 +2138,7 @@ LOCALIZED_SH += t/t0200/test.sh
 LOCALIZED_PERL += t/t0200/test.perl
 endif
 
-po/git.pot: $(LOCALIZED_C)
+po/git.pot: $(GENERATED_H) FORCE
$(QUIET_XGETTEXT)$(XGETTEXT) -o$@+ $(XGETTEXT_FLAGS_C) $(LOCALIZED_C)
 	$(QUIET_XGETTEXT)$(XGETTEXT) -o$@+ --join-existing $(XGETTEXT_FLAGS_SH) \
 		$(LOCALIZED_SH)
-- 
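The FORCE idiom the patch relies on can be seen in a throwaway Makefile (a sketch; the target name is made up): a target that depends on the empty phony FORCE rule is re-run on every explicit invocation, with no dependency tracking at all.

```shell
# Build the same target twice; FORCE makes it re-run every time.
dir=$(mktemp -d)
printf 'pot: FORCE\n\t@echo regenerated\nFORCE:\n' > "$dir/Makefile"
make -s -C "$dir" pot
make -s -C "$dir" pot
```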


Shallow clones with explicit history cutoff?

2014-08-21 Thread Matthias Urlichs
Hi,

use case: I am packaging the FOO program for Debian. FOO is maintained in
git but it has a bunch of problems (e.g. because somebody mistakenly checked
in a huge blob which would give the ).

The current workflow for this is to create a new branch, remove the
offending bits if necessary, create a FOO-clean.tar.xz file, and ship that
as original source. I find that to be suboptimal.

What I would like to have, instead, is a version of shallow cloning which
cuts off not at a pre-determined depth, but at a given branch (or set of
branches). In other words, given

    +-J--K              (packaged)
   /    /
  +-F--G--H--I          (clean)
 /       /
A---B---C---D---E       (upstream)

a command "git clone --shallow-until upstream $REPO" (or however that would
be named) would create a shallow git archive which contains branches
packaged+clean, with commits FGHIJK. In contrast, with --single-branch and
--depth 4 I would get CGHIJK, which isn't what I'd want.

As I have not spent too much time with the git sources lately (as in None
at all), some pointers where to start implementing this would be
appreciated, assuming (a) this has a reasonable chance of landing in git and
(b) nobody beats me to it. ;-)

-- 
-- Matthias Urlichs
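The limitation being worked around can be demonstrated directly: --depth counts commits from the tip, regardless of where the interesting fork point was. Five linear commits cloned with --depth 2 keep exactly the last two (a sketch in a throwaway repo; needs git).

```shell
# Build a 5-commit repo, then shallow-clone it with a fixed depth.
dir=$(mktemp -d); cd "$dir"
git init -q full && cd full
git config user.email you@example.com && git config user.name you
for c in A B C D E; do git commit -q --allow-empty -m "$c"; done
cd ..
git clone -q --depth 2 "file://$dir/full" shallow
count=$(git -C shallow rev-list --count HEAD)
echo "commits in shallow clone: $count"
```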



[PATCH v3 1/3] convert: Refactor would_convert_to_git() to single arg 'path'

2014-08-21 Thread Steffen Prohaska
It is only the path that matters in the decision whether to filter or
not.  Clarify this by making path the single argument of
would_convert_to_git().

Signed-off-by: Steffen Prohaska proha...@zib.de
---
 convert.h   | 5 ++---
 sha1_file.c | 2 +-
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/convert.h b/convert.h
index 0c2143c..c638b33 100644
--- a/convert.h
+++ b/convert.h
@@ -40,10 +40,9 @@ extern int convert_to_working_tree(const char *path, const char *src,
   size_t len, struct strbuf *dst);
 extern int renormalize_buffer(const char *path, const char *src, size_t len,
  struct strbuf *dst);
-static inline int would_convert_to_git(const char *path, const char *src,
-  size_t len, enum safe_crlf checksafe)
+static inline int would_convert_to_git(const char *path)
 {
-   return convert_to_git(path, src, len, NULL, checksafe);
+   return convert_to_git(path, NULL, 0, NULL, 0);
 }
 
 /*
diff --git a/sha1_file.c b/sha1_file.c
index 3f70b1d..00c07f2 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3144,7 +3144,7 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st,
if (!S_ISREG(st->st_mode))
ret = index_pipe(sha1, fd, type, path, flags);
else if (size <= big_file_threshold || type != OBJ_BLOB ||
-(path && would_convert_to_git(path, NULL, 0, 0)))
+(path && would_convert_to_git(path)))
ret = index_core(sha1, fd, size, type, path, flags);
else
ret = index_stream(sha1, fd, size, type, path, flags);
-- 
2.1.0.6.gb452461
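The point of the refactor can be restated in shell terms: whether content would be filtered is a property of the path alone, decided by its attributes. 'git check-attr' exposes exactly that per-path decision (the filter name 'demo' and the paths here are made up).

```shell
# Attribute lookup depends only on the path, not on any file contents.
dir=$(mktemp -d); cd "$dir"; git init -q .
echo '*.txt filter=demo' > .gitattributes
out=$(git check-attr filter notes.txt README)
echo "$out"
```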



[PATCH v3 2/3] Introduce GIT_MMAP_LIMIT to allow testing expected mmap size

2014-08-21 Thread Steffen Prohaska
Similar to testing expectations about malloc with GIT_ALLOC_LIMIT (see
commit d41489), it can be useful to test expectations about mmap.

This introduces a new environment variable GIT_MMAP_LIMIT to limit the
largest allowed mmap length (in KB).  xmmap() is modified to check the
limit.  Together with GIT_ALLOC_LIMIT tests can now easily confirm
expectations about memory consumption.

GIT_MMAP_LIMIT will be used in the next commit to test that data will
be streamed to an external filter without mmaping the entire file.

[commit d41489]: d41489a6424308dc9a0409bc2f6845aa08bd4f7d Add more large
blob test cases

Signed-off-by: Steffen Prohaska proha...@zib.de
---
 sha1_file.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/sha1_file.c b/sha1_file.c
index 00c07f2..88d64c0 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -663,10 +663,25 @@ void release_pack_memory(size_t need)
; /* nothing */
 }
 
+static void mmap_limit_check(size_t length)
+{
+   static int limit = -1;
+   if (limit == -1) {
+   const char *env = getenv("GIT_MMAP_LIMIT");
+   limit = env ? atoi(env) * 1024 : 0;
+   }
+   if (limit && length > limit)
+   die("attempting to mmap %"PRIuMAX" over limit %d",
+   (intmax_t)length, limit);
+}
+
 void *xmmap(void *start, size_t length,
int prot, int flags, int fd, off_t offset)
 {
-   void *ret = mmap(start, length, prot, flags, fd, offset);
+   void *ret;
+
+   mmap_limit_check(length);
+   ret = mmap(start, length, prot, flags, fd, offset);
if (ret == MAP_FAILED) {
if (!length)
return NULL;
-- 
2.1.0.6.gb452461
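The limit logic of mmap_limit_check() can be transliterated to shell to see the intended semantics (assumed from the patch: the value is in KB, and unset or 0 means unlimited).

```shell
# Simulate the check for a 2048-byte mapping under a 1 KB limit.
GIT_MMAP_LIMIT=1   # pretend the test harness exported 1 (KB)
limit=$(( ${GIT_MMAP_LIMIT:-0} * 1024 ))
length=2048
if [ "$limit" -gt 0 ] && [ "$length" -gt "$limit" ]; then
	msg="attempting to mmap $length over limit $limit"
fi
echo "$msg"
```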



[PATCH v3 3/3] convert: Stream from fd to required clean filter instead of mmap

2014-08-21 Thread Steffen Prohaska
The data is streamed to the filter process anyway.  Better avoid mapping
the file if possible.  This is especially useful if a clean filter
reduces the size, for example if it computes a sha1 for binary data,
like git media.  The file size that the previous implementation could
handle was limited by the available address space; large files for
example could not be handled with (32-bit) msysgit.  The new
implementation can filter files of any size as long as the filter output
is small enough.

The new code path is only taken if the filter is required.  The filter
consumes data directly from the fd.  The original data is not available
to git, so it must fail if the filter fails.

The environment variable GIT_MMAP_LIMIT, which has been introduced in
the previous commit is used to test that the expected code path is
taken.  A related test that exercises required filters is modified to
verify that the data actually has been modified on its way from the file
system to the object store.

Signed-off-by: Steffen Prohaska proha...@zib.de
---
 convert.c | 60 +--
 convert.h |  5 +
 sha1_file.c   | 27 ++-
 t/t0021-conversion.sh | 24 -
 4 files changed, 104 insertions(+), 12 deletions(-)

diff --git a/convert.c b/convert.c
index cb5fbb4..463f6de 100644
--- a/convert.c
+++ b/convert.c
@@ -312,11 +312,12 @@ static int crlf_to_worktree(const char *path, const char *src, size_t len,
 struct filter_params {
const char *src;
unsigned long size;
+   int fd;
const char *cmd;
const char *path;
 };
 
-static int filter_buffer(int in, int out, void *data)
+static int filter_buffer_or_fd(int in, int out, void *data)
 {
/*
 * Spawn cmd and feed the buffer contents through its stdin.
@@ -325,6 +326,7 @@ static int filter_buffer(int in, int out, void *data)
struct filter_params *params = (struct filter_params *)data;
int write_err, status;
const char *argv[] = { NULL, NULL };
+   int fd;
 
/* apply % substitution to cmd */
struct strbuf cmd = STRBUF_INIT;
@@ -355,7 +357,17 @@ static int filter_buffer(int in, int out, void *data)
 
sigchain_push(SIGPIPE, SIG_IGN);
 
-   write_err = (write_in_full(child_process.in, params->src, params->size) < 0);
+   if (params->src) {
+   write_err = (write_in_full(child_process.in, params->src, params->size) < 0);
+   } else {
+   /* dup(), because copy_fd() closes the input fd. */
+   fd = dup(params-fd);
+   if (fd < 0)
+   write_err = error("failed to dup file descriptor.");
+   else
+   write_err = copy_fd(fd, child_process.in);
+   }
+
if (close(child_process.in))
write_err = 1;
if (write_err)
@@ -371,7 +383,7 @@ static int filter_buffer(int in, int out, void *data)
return (write_err || status);
 }
 
-static int apply_filter(const char *path, const char *src, size_t len,
+static int apply_filter(const char *path, const char *src, size_t len, int fd,
 struct strbuf *dst, const char *cmd)
 {
/*
@@ -392,11 +404,12 @@ static int apply_filter(const char *path, const char *src, size_t len,
return 1;
 
memset(&async, 0, sizeof(async));
-   async.proc = filter_buffer;
+   async.proc = filter_buffer_or_fd;
async.data = params;
async.out = -1;
params.src = src;
params.size = len;
+   params.fd = fd;
params.cmd = cmd;
params.path = path;
 
@@ -747,6 +760,24 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
}
 }
 
+int would_convert_to_git_filter_fd(const char *path)
+{
+   struct conv_attrs ca;
+
+   convert_attrs(&ca, path);
+   if (!ca.drv)
+   return 0;
+
+   /* Apply a filter to an fd only if the filter is required to succeed.
+    * We must die if the filter fails, because the original data before
+    * filtering is not available.
+    */
+   if (!ca.drv->required)
+   return 0;
+
+   return apply_filter(path, NULL, 0, -1, NULL, ca.drv->clean);
+}
+
 int convert_to_git(const char *path, const char *src, size_t len,
struct strbuf *dst, enum safe_crlf checksafe)
 {
@@ -761,7 +792,7 @@ int convert_to_git(const char *path, const char *src, size_t len,
required = ca.drv->required;
}
 
-   ret |= apply_filter(path, src, len, dst, filter);
+   ret |= apply_filter(path, src, len, -1, dst, filter);
if (!ret && required)
die("%s: clean filter '%s' failed", path, ca.drv->name);
 
@@ -778,6 +809,23 @@ int convert_to_git(const char *path, const char *src, size_t len,
return ret | ident_to_git(path, src, len, dst, ca.ident);
 }
 
+void convert_to_git_filter_fd(const char *path, int fd, struct strbuf *dst,
+
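The streaming case can be exercised end to end with a toy required clean filter that shrinks content to a fixed-size digest, the git-media-like scenario the patch targets (the filter name 'sum' and the .bin pattern are made up; needs git and sha1sum).

```shell
# A required clean filter: store only the sha1 of the file's contents.
dir=$(mktemp -d); cd "$dir"; git init -q .
git config filter.sum.clean 'sha1sum | cut -d" " -f1'
git config filter.sum.required true
echo '*.bin filter=sum' > .gitattributes
printf 'lots of binary payload' > big.bin
git add big.bin
git cat-file blob :big.bin
```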

[PATCH v3 0/3] Stream fd to clean filter, GIT_MMAP_LIMIT

2014-08-21 Thread Steffen Prohaska
I revised the testing approach as discussed.  Patch 2/3 adds GIT_MMAP_LIMIT,
which allows testing of memory expectations together with GIT_ALLOC_LIMIT.

The rest is unchanged compared to v2.

Steffen Prohaska (3):
  convert: Refactor would_convert_to_git() to single arg 'path'
  Introduce GIT_MMAP_LIMIT to allow testing expected mmap size
  convert: Stream from fd to required clean filter instead of mmap

 convert.c | 60 +--
 convert.h | 10 ++---
 sha1_file.c   | 46 ---
 t/t0021-conversion.sh | 24 -
 4 files changed, 123 insertions(+), 17 deletions(-)

-- 
2.1.0.6.gb452461



Re: What's cooking in git.git (Aug 2014, #03; Wed, 20)

2014-08-21 Thread Junio C Hamano
Heiko Voigt hvo...@hvoigt.net writes:

 On Wed, Aug 20, 2014 at 04:17:33PM -0700, Junio C Hamano wrote:
 * hv/submodule-config (2014-06-30) 4 commits
   (merged to 'next' on 2014-07-17 at 5e0ce45)
  + do not die on error of parsing fetchrecursesubmodules option
  + use new config API for worktree configurations of submodules
  + extract functions for submodule config set and lookup
  + implement submodule config cache for lookup of submodule names
 
  Will cook in 'next'.

 While using the config API for implementing my recursive fetch. I
 discovered a bug in my API here. In submodule_from_name() the lookup of
 the gitmodule sha1 is missing. So currently you would have to pass in
 the gitmodule sha1 instead of the commit sha1 as documented. I will
 extend the test and fix this.

OK, I do not mind temporarily kicking this back to 'pu', so that you
can replace these wholesale instead of doing an incremental patch on
top, when we rewind 'next' in a few days.

Thanks.


Re: cherry picking and merge

2014-08-21 Thread Keller, Jacob E
On Fri, 2014-08-01 at 09:56 -0700, Mike Stump wrote:
 Since everything I do goes up and down into repositories and I don’t want my 
 friends and family to scorn me, rebase isn’t the command I want to use.

You completely misunderstand what "published" means. Published history
is history from which other people can pull right now.

That means it has to be in a publicly addressable repository (ie: just
like the remote that you are pulling from as upstream).

rebasing commits which are already in the upstream is bad. Rebasing
commits which you have created locally is NOT bad. These commits would
not be published until you do a push.

This is the fundamental issue with rebase, and it is in fact easy to
avoid misusing, especially if you don't publish changes. The key is
that a commit isn't published until it's something someone else can
depend on.

Doing git pull --rebase essentially doesn't ever get you into trouble.

Regards,
Jake


Re: cherry picking and merge

2014-08-21 Thread Keller, Jacob E
On Thu, 2014-08-21 at 17:36 +, Keller, Jacob E wrote:
 On Fri, 2014-08-01 at 09:56 -0700, Mike Stump wrote:
  Since everything I do goes up and down into repositories and I don’t want 
  my friends and family to scorn me, rebase isn’t the command I want to use.
 
 You completely misunderstand what "published" means. Published history
 is history from which other people can pull right now.
 
 That means it has to be in a publicly addressable repository (ie: just
 like the remote that you are pulling from as upstream).
 
 rebasing commits which are already in the upstream is bad. Rebasing
 commits which you have created locally is NOT bad. These commits would
 not be published until you do a push.
 
 This is the fundamental issue with rebase, and it is in fact easy to
 avoid misusing it, especially if you don't publish changes. The key is
 that a commit isn't published until it's something someone else can
 depend on.
 
 Doing git pull --rebase essentially doesn't ever get you into trouble.
 
 Regards,
 Jake

Pardon me. You can actually ignore this post. I read through more of the
thread, and actually realize I completely misunderstood what your issue
was, and why rebase might not work.

Regards,
Jake


Re: [BUG] resolved deltas

2014-08-21 Thread Petr Stodulka



snip
The bug is reproducible since git version 1.8.3.1 (maybe earlier in
1.8.x, but I didn't test it) up to the current upstream version.
This problem doesn't exist in version 1.7.x, or more precisely is not
reproducible there. It may be reproducible since commit 7218a215,
which added an assert in builtin/index-pack.c (currently line 918),
but I didn't test this.

OK, so this is reproducible since this commit because of the assert().

Here I am lost. I don't really know what I can do next here, because I 
don't understand some ideas in the code, e.g. the search for children 
in the functions find_delta() and find_delta_children(). The calculation on line 618:


int next = (first+last) / 2;

I still don't understand it. I didn't find a description of this search 
algorithm in the technical documentation, but I haven't read all of it 
yet. However I think that the source of the problem could be somewhere 
in these two functions. When a child is found, its real_type is set to 
the parent's type in the function resolve_delta() on line 865, and then 
it only waits for the failure. I don't think that the problem is in the 
repository itself [1], but it is possible.
I read the history of commits and my idea seems to be incorrect. It 
seems more like some error in the repository itself. But I'd rather get 
an opinion from someone who knows this code and its ideas better.


Regards,
Petr


[0] https://bugzilla.redhat.com/show_bug.cgi?id=1099919
[1] git clone https://code.google.com/p/mapsforge/ mapsforge.git




[PATCH] sha1_name: avoid quadratic list insertion in handle_one_ref

2014-08-21 Thread René Scharfe
Similar to 16445242 (fetch-pack: avoid quadratic list insertion in
mark_complete), sort only after all refs are collected instead of while
inserting.  The result is the same, but it's more efficient that way.
The difference will only be measurable in repositories with a large
number of refs.

Signed-off-by: Rene Scharfe l@web.de
---
 sha1_name.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sha1_name.c b/sha1_name.c
index 63ee66f..7098b10 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -839,7 +839,7 @@ static int handle_one_ref(const char *path,
}
if (object->type != OBJ_COMMIT)
return 0;
-   commit_list_insert_by_date((struct commit *)object, list);
+   commit_list_insert((struct commit *)object, list);
return 0;
 }
 
@@ -1366,6 +1366,7 @@ static int get_sha1_with_context_1(const char *name,
if (!only_to_die && namelen > 2 && name[1] == '/') {
struct commit_list *list = NULL;
for_each_ref(handle_one_ref, &list);
+   commit_list_sort_by_date(list);
return get_sha1_oneline(name + 2, sha1, list);
}
if (namelen < 3 ||
-- 
2.1.0



[PATCH] walker: avoid quadratic list insertion in mark_complete

2014-08-21 Thread René Scharfe
Similar to 16445242 (fetch-pack: avoid quadratic list insertion in
mark_complete), sort only after all refs are collected instead of while
inserting.  The result is the same, but it's more efficient that way.
The difference will only be measurable in repositories with a large
number of refs.

Signed-off-by: Rene Scharfe l@web.de
---
 walker.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/walker.c b/walker.c
index 0148264..0596e99 100644
--- a/walker.c
+++ b/walker.c
@@ -205,7 +205,7 @@ static int mark_complete(const char *path, const unsigned char *sha1, int flag,
struct commit *commit = lookup_commit_reference_gently(sha1, 1);
if (commit) {
commit->object.flags |= COMPLETE;
-   commit_list_insert_by_date(commit, &complete);
+   commit_list_insert(commit, &complete);
}
return 0;
 }
@@ -271,8 +271,10 @@ int walker_fetch(struct walker *walker, int targets, char **target,
}
}
 
-   if (!walker->get_recover)
+   if (!walker->get_recover) {
for_each_ref(mark_complete, NULL);
+   commit_list_sort_by_date(&complete);
+   }
 
for (i = 0; i < targets; i++) {
if (interpret_target(walker, target[i], sha1[20 * i])) {
-- 
2.1.0



Re: [PATCH 0/4] Handling unmerged files with merged entries

2014-08-21 Thread Johannes Sixt
Am 21.08.2014 00:19, schrieb Junio C Hamano:
 For that, we need to catch an index whose entries are not sorted and
 error out, perhaps when read_index_from() iterates over the mmapped
 index entries.  We can even draw that hopelessly corrupt line
 above the breakage you are addressing and add a check to make sure
 no path has both merged and unmerged entries to the same check to
 make it error out.

Except that we can't declare an index with both merged and unmerged
entries as "hopelessly corrupt, return to sender" when it's dead easy to
generate with the git tool set:

 >x
 name=$(git hash-object -w x)
 for i in 0 1 2 3; do printf '100644 %s %d\tx\n' $name $i; done |
 git update-index --index-info

-- Hannes



Re: [PATCH] Check order when reading index

2014-08-21 Thread Junio C Hamano
Jaime Soriano Pastor jsorianopas...@gmail.com writes:

 Signed-off-by: Jaime Soriano Pastor jsorianopas...@gmail.com
 ---
  read-cache.c | 18 ++
  1 file changed, 18 insertions(+)

 diff --git a/read-cache.c b/read-cache.c
 index 7f5645e..e117d3a 100644
 --- a/read-cache.c
 +++ b/read-cache.c
 @@ -1438,6 +1438,21 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
   return ce;
  }
  
 +void check_next_ce(struct cache_entry *ce, struct cache_entry *next_ce) {

Have opening brace for the function on its own line, i.e.

void check_next_ce(struct cache_entry *ce, struct cache_entry *next_ce)
{

The function might be misnamed (see below), though.

 + if (!ce || !next_ce)
 + return;

Hmph, would it be either a programming error or a corrupt index
input to see a NULL in either of these variables?

 + if (cache_name_compare(ce->name, ce_namelen(ce),
 +                        next_ce->name, ce_namelen(next_ce)) > 1)

An odd indentation that is overly deep to make it hard to read.

if (cache_name_compare(ce->name, ce_namelen(ce),
   next_ce->name, ce_namelen(next_ce)) > 1)

should be sufficient (replacing 7-SP before next_ce with a HT is OK
if the existing code nearby does so).

What is the significance of the return value from cache_name_compare()
that is strictly greater than 1 (as opposed to merely "is it positive")?

Perhaps you want something that is modeled after ce_same_name() that
ignores the stage, i.e.

int ce_name_compare(const struct cache_entry *a, const struct cache_entry *b)
{
	return strcmp(a->ce_name, b->ce_name);
}

without reimplementing the cache-name-compare() as

int bad_ce_same_name(const struct cache_entry *a, const struct cache_entry *b)
{
	return !ce_same_name(a, b);
}

to keep the "two names with different length could never be the
same" optimization.

- if (0 <= ce_name_compare(ce, next)) then the names are not sorted

- if (!stage(ce) && !name_compare(ce, next)) then the merged
  entry 'ce' is not the only entry for the path



 + 		die("Unordered stage entries in index");
 + 	if (ce_same_name(ce, next_ce)) {
 + 		if (!ce_stage(ce))
 + 			die("Multiple stage entries for merged file '%s'",
 + 				ce->name);
 + 		if (ce_stage(ce) >= ce_stage(next_ce))
 + 			die("Unordered stage entries for '%s'", ce->name);
 + 	}
 +}
 +}
 +
  /* remember to discard_cache() before reading a different cache! */
  int read_index_from(struct index_state *istate, const char *path)
  {
 @@ -1499,6 +1514,9 @@ int read_index_from(struct index_state *istate, const char *path)
   ce = create_from_disk(disk_ce, consumed, previous_name);
   set_index_entry(istate, i, ce);
  
 + 	if (i > 0)
 + 		check_next_ce(istate->cache[i-1], ce);

Have a SP each on both sides of binary operator -.

Judging from the way this helper function is used, it looks like
check_with_previous_ce() is a more appropriate name.  After all, you
are not checking the next ce which you haven't even created yet ;-)


Thanks.


Hook post-merge does not get executed in case of conflicts

2014-08-21 Thread Bertram Scharpf
Hi,

today I wrote a post-merge hook. Then I just detected that it only gets
executed when the merge is immediately successful. In case there is a
conflict, I have to finish the merge using the command "git commit".
This will not call the post-merge hook.

I think the hook should reliably be executed on _every_ non-failed
merge. Therefore I propose the extension below.

Bertram


diff --git a/builtin/commit.c b/builtin/commit.c
index 5ed6036..6a8ee2d 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1783,6 +1783,8 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 
rerere(0);
run_commit_hook(use_editor, get_index_file(), "post-commit", NULL);
+   if (whence == FROM_MERGE)
+           run_hook_le(NULL, "post-merge", 0, NULL);
if (amend  !no_post_rewrite) {
struct notes_rewrite_cfg *cfg;
cfg = init_copy_notes_for_rewrite(amend);


-- 
Bertram Scharpf
Stuttgart, Deutschland/Germany
http://www.bertram-scharpf.de


Re: Hook post-merge does not get executed in case of conflicts

2014-08-21 Thread Jonathan Nieder
Hi,

Bertram Scharpf wrote:

 today I wrote a post-merge hook. Then I just detected that it only gets
 executed when the merge is immediately successful. In case there is a
 conflict, I have to finish the merge using the command git commit.
 This will not call the post-merge hook.

 I think the hook should be reliable to be executed on _every_ non-failed
 merge. Therefore I propose the below extension.

I agree that at first glance this sounds like a good thing.  A manual
conflict resolution is not so different from a very smart merge
strategy, after all.

Nits:

 Bertram

Sign-off?  (See Documentation/SubmittingPatches, section 5 "Sign your
work" for what this means.)

 --- a/builtin/commit.c
 +++ b/builtin/commit.c
 @@ -1783,6 +1783,8 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
  
   rerere(0);
  	run_commit_hook(use_editor, get_index_file(), "post-commit", NULL);
 + 	if (whence == FROM_MERGE)
 + 		run_hook_le(NULL, "post-merge", 0, NULL);

"git merge" doesn't run the post-commit hook, so there's a new
asymmetry being introduced here.  Should "git merge" run the
post-commit hook?  Should a "git commit" that means "git merge
--continue" avoid running it?

Also if doing this for real, the documentation should be updated
and tests introduced to make sure the behavior doesn't get broken
in the future.  Documentation/githooks.txt currently says

This hook cannot affect the outcome of 'git merge' and is not
executed if the merge failed due to conflicts.

which would need to be updated to say that the hook will run later
in that case, when the merge is finally committed.

Thanks and hope that helps,
Jonathan


Re: [PATCH 18/18] signed push: final protocol update

2014-08-21 Thread Shawn Pearce
On Tue, Aug 19, 2014 at 3:06 PM, Junio C Hamano gits...@pobox.com wrote:

 +  push-cert = PKT-LINE("push-cert" NUL capability-list LF)

Haha. NUL.  I love our wire protocol.

 + PKT-LINE("certificate version 0.1" LF)
 + PKT-LINE(pusher ident LF)
 + PKT-LINE(LF)
 + *PKT-LINE(command LF)
 + *PKT-LINE(GPG signature lines LF)

Should we include the URL as part of this certificate?

Perhaps the pusher means to sign the master branch of experimental
tree, but not their trunk tree?


Re: [PATCH v20 43/48] refs.c: move the check for valid refname to lock_ref_sha1_basic

2014-08-21 Thread Ronnie Sahlberg
On Wed, Aug 20, 2014 at 11:34 AM, Michael Haggerty mhag...@alum.mit.edu wrote:
 On 08/20/2014 06:28 PM, Ronnie Sahlberg wrote:
 On Wed, Aug 20, 2014 at 7:52 AM, Michael Haggerty mhag...@alum.mit.edu 
 wrote:
 I'm a little worried that abandoning *all* refname checks could open us
 up to somehow trying to delete a reference with a name like
 ../../../../etc/passwd.  Either such names have to be prohibited
 somehow, or we have to be very sure that they can only come from trusted
 sources.

 I only set this flag from builtin/branch.c so it should only be used
 when a user runs 'git branch -D' from the command line.
 All other places where we delete branches we should still be checking
 the refname for badness.

 That said, unless the rules for good refnames change in the future,
 which is unlikely, it should be exceptionally rare that a user ends up
 with a bad refname in the first place.
 Perhaps the example I gave was bad, since if you manually create bad
 refs using echo >.git/refs/heads/... then you should probably know
 how to fix it too, and thus maybe we do not need this patch in the
 first place.

 Do you want me to delete this patch and resend this part of the
 series? Or is the 'only works for branch -D from the command line'
 sufficient?
 I have no strong feelings either way so I will just follow what you decide.

 I think that if you run the refname through normalize_path_copy_len()
 and that function returns (1) without an error, (2) without modifying
 its argument, and (3) the result does not begin with a
 has_dos_drive_prefix() or is_dir_sep(), then we should be safe against
 directory traversal attacks.  I suggest doing this kind of check even if
 not doing the full check_refname_format() check.

Good idea.
Let me add this.



 Michael

 --
 Michael Haggerty
 mhag...@alum.mit.edu



Re: [PATCH v4] Allow the user to change the temporary file name for mergetool

2014-08-21 Thread Junio C Hamano
Robin Rosenberg robin.rosenb...@dewire.com writes:

 Using the original filename suffix for the temporary input files to
 the merge tool confuses IDEs like Eclipse. This patch introduces
 a configuration option, mergetool.tmpsuffix, which gets appended to
 the temporary file name. That way the user can choose to use a
 suffix like .tmp, which does not cause confusion.

 Signed-off-by: Robin Rosenberg robin.rosenb...@dewire.com
 ---
  Documentation/config.txt|  5 +
  Documentation/git-mergetool.txt |  7 +++
  git-mergetool.sh| 10 ++
  3 files changed, 18 insertions(+), 4 deletions(-)

 Fixed a spelling error.

 diff --git a/Documentation/config.txt b/Documentation/config.txt
 index c55c22a..0e15800 100644
 --- a/Documentation/config.txt
 +++ b/Documentation/config.txt
 @@ -1778,6 +1778,11 @@ notes.displayRef::
   several times.  A warning will be issued for refs that do not
   exist, but a glob that does not match any refs is silently
   ignored.
 +
 +mergetool.tmpsuffix::
 + A string to append to the names of the temporary files mergetool
 + creates in the worktree as input to a custom merge tool. The
 + primary use is to avoid confusion in IDEs during a merge.
  +
  This setting can be overridden with the `GIT_NOTES_DISPLAY_REF`
  environment variable, which must be a colon separated list of refs or

Please read the surrounding text again and answer this question:

What is "This setting" in the continued paragraph of your paragraph
that describes the mergetool.tmpsuffix variable talking about?

Stated in another way, "match any refs is silently ignored." is the
end of the first paragraph for notes.displayRef.  "This setting can
be overridden" is the beginning of the second paragraph for the same
variable.

 +`git mergetool` may also create other temporary files for the
 +different versions involved in the merge. By default these files have
 +the same filename suffix as the file being merged. This may confuse
 +other tools in use during a long merge operation. The user can set

I would suggest these changes:

 - replace "being merged" with "being merged, so that editors and
   IDEs can use the suffix for syntax highlighting."

 - replace "this may confuse other tools" with "this may confuse
   some tools".  The same tool that takes advantage of the suffix to
   syntax-highlight may also be confused in a way that you deem
   undesirable.

 - clarify what kind of confusion this warning is talking about, and
   offer an example to avoid such confusion, and drop "in use during
   a long merge operation", as that phrase alone, without knowing
   in what way the tools are confused, is not useful to the readers.

For the last item, I unfortunately cannot offer a solid replacement
phrasing, as it was not quite clear from your explanation during the
discussion, at least to me.  Is it that Eclipse notices that a new
.java file in the working tree appeared and offers to add it or
something?  If that is the case, then perhaps I would suggest
something like this:

This reuse of the same file suffix may however confuse some
tools.  For example, Eclipse may notice, while resolving
conflicts on hello.java, that new files hello.LOCAL.java and
hello.REMOTE.java appear in your working tree and helpfully
offer to add it to your index and then upon conclusion of the
merge it would complain because these files are now gone.  To
avoid causing such confusion, you can use this variable to a
suffix that your IDE does not treat specially, e.g. .tmp (this
may obviously lose syntax highlighting, though).

But I am not sure what confusion you are trying to work around, so
the single sentence that begins with For example, above would need
to be completely rewritten, I guess.

Thanks.


Re: [PATCH 1/1] pretty: note that %cd respects the --date= option

2014-08-21 Thread Junio C Hamano
Thomas Braun thomas.br...@virtuell-zuhause.de writes:

 Signed-off-by: Thomas Braun thomas.br...@virtuell-zuhause.de
 ---

 Today I found out that both %cd and %ad pretty print format
 specifications honour the --date option as shown in:

 $ git log --abbrev=8 --date=short --pretty="%h (%s, %cd)" -n1
 5bdb1c4e (Merge pull request #245 from
 kblees/kb/master/fix-libsvn-address-conflict, 2014-08-16)
 $ git log --abbrev=8 --date=short --pretty="%h (%s, %ad)" -n1
 5bdb1c4e (Merge pull request #245 from
 kblees/kb/master/fix-libsvn-address-conflict, 2014-08-16)

 But the documentation did not mention that.

  Documentation/pretty-formats.txt | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/Documentation/pretty-formats.txt
 b/Documentation/pretty-formats.txt
 index 85d6353..eac7909 100644
 --- a/Documentation/pretty-formats.txt
 +++ b/Documentation/pretty-formats.txt
 @@ -122,7 +122,7 @@ The placeholders are:
  - '%ce': committer email
  - '%cE': committer email (respecting .mailmap, see
linkgit:git-shortlog[1] or linkgit:git-blame[1])
 -- '%cd': committer date
 +- '%cd': committer date (format respects --date= option)

Funny that we already have the same text for %ad.  Thanks.

  - '%cD': committer date, RFC2822 style
  - '%cr': committer date, relative
  - '%ct': committer date, UNIX timestamp


Re: [PATCH] teach fast-export an --anonymize option

2014-08-21 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 +/*
 + * We anonymize each component of a path individually,
 + * so that paths a/b and a/c will share a common root.
 + * The paths are cached via anonymize_mem so that repeated
 + * lookups for a will yield the same value.
 + */
 +static void anonymize_path(struct strbuf *out, const char *path,
 +struct hashmap *map,
 +char *(*generate)(const char *, size_t *))
 +{
 + while (*path) {
 + const char *end_of_component = strchrnul(path, '/');
 + size_t len = end_of_component - path;
 + const char *c = anonymize_mem(map, generate, path, &len);
 + strbuf_add(out, c, len);
 + path = end_of_component;
 + if (*path)
 + strbuf_addch(out, *path++);
 + }
 +}

Do two paths sort the same way before and after anonymisation?  For
example, if generate() works as a simple substitution, it should map
a character that sorts before (or after) '/' with another that also
sorts before (or after) '/' for us to be able to diagnose an error
that comes from D/F sort order confusion.


Re: [PATCH] teach fast-export an --anonymize option

2014-08-21 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 +--anonymize::
 + Replace all paths, blob contents, commit and tag messages,
 + names, and email addresses in the output with anonymized data,
 + while still retaining the shape of history and of the stored
 + tree.

Sometimes branch names can contain codenames the project may prefer
to hide from the general public, so they may need to be anonymised
as well.




Re: [PATCH 0/4] Handling unmerged files with merged entries

2014-08-21 Thread Junio C Hamano
Johannes Sixt j...@kdbg.org writes:

 Am 21.08.2014 00:19, schrieb Junio C Hamano:
 For that, we need to catch an index whose entries are not sorted and
 error out, perhaps when read_index_from() iterates over the mmapped
 index entries.  We can even draw that hopelessly corrupt line
 above the breakage you are addressing and add a check to make sure
 no path has both merged and unmerged entries to the same check to
 make it error out.

 Except that we can't declare an index with both merged and unmerged
 entries as "hopelessly corrupt, return to sender" when it's dead easy to
 generate with the git tool set:

  >x
  name=$(git hash-object -w x)
  for i in 0 1 2 3; do printf '100644 %s %d\tx\n' $name $i; done |
  git update-index --index-info

Because hash-object and update-index deliberately have these holes
to allow us (read: me ;-) to easily experiment with new and/or
disallowed formats, I wouldn't take that as a serious objection.  It
is dead easy to corrupt your repository or lose your data with
/bin/rm, too ;-)



Re: [PATCH 3/4] Added tests for the case of merged and unmerged entries for the same file

2014-08-21 Thread Junio C Hamano
Jaime Soriano Pastor jsorianopas...@gmail.com writes:

 On Wed, Aug 20, 2014 at 11:00 PM, Junio C Hamano gits...@pobox.com wrote:
 Jaime Soriano Pastor jsorianopas...@gmail.com writes:

 Signed-off-by: Jaime Soriano Pastor jsorianopas...@gmail.com
 ---
  t/t9904-unmerged-file-with-merged-entry.sh | 86 
 ++

 Isn't this number already used for another test?  A test on the
 index probably belongs to t2XXX or t3XXX family.

 Umm, I thought this test number was free; I just added it at the
 last+1 position. If I finally add a test I'll take this into
 account. Thanks.

Please check t/README for classes of features and appropriate first
digit; also do not forget that there are topics by other people in
flight and you may need to at least check with the tip of the 'pu'
branch.

Thanks.

  1 file changed, 86 insertions(+)
  create mode 100755 t/t9904-unmerged-file-with-merged-entry.sh

 diff --git a/t/t9904-unmerged-file-with-merged-entry.sh 
 b/t/t9904-unmerged-file-with-merged-entry.sh
 new file mode 100755
 index 000..945bc1c
 --- /dev/null
 +++ b/t/t9904-unmerged-file-with-merged-entry.sh
 @@ -0,0 +1,86 @@
 +#!/bin/sh
 +
 +test_description='Operations with unmerged files with merged entries'
 +
 +. ./test-lib.sh
 +
 +setup_repository() {
 +...
 +}

 No error is checked here?

 This is only a helper function for setup, not a test itself.

So what?  If the set-up fails, we would want

$ sh t-my-test.sh -i

to immediately stop without going further.



Re: [PATCH v3 2/3] Introduce GIT_MMAP_LIMIT to allow testing expected mmap size

2014-08-21 Thread Junio C Hamano
Steffen Prohaska proha...@zib.de writes:

 Similar to testing expectations about malloc with GIT_ALLOC_LIMIT (see
 commit d41489), it can be useful to test expectations about mmap.

 This introduces a new environment variable GIT_MMAP_LIMIT to limit the
 largest allowed mmap length (in KB).  xmmap() is modified to check the
 limit.  Together with GIT_ALLOC_LIMIT tests can now easily confirm
 expectations about memory consumption.

 GIT_ALLOC_LIMIT will be used in the next commit to test that data will

I smell the need for s/ALLOC/MMAP/ here, but perhaps you did mean
ALLOC (I won't know until I check 3/3 ;-)

 be streamed to an external filter without mmaping the entire file.

 [commit d41489]: d41489a6424308dc9a0409bc2f6845aa08bd4f7d Add more large
 blob test cases

 Signed-off-by: Steffen Prohaska proha...@zib.de
 ---
  sha1_file.c | 17 -
  1 file changed, 16 insertions(+), 1 deletion(-)

 diff --git a/sha1_file.c b/sha1_file.c
 index 00c07f2..88d64c0 100644
 --- a/sha1_file.c
 +++ b/sha1_file.c
 @@ -663,10 +663,25 @@ void release_pack_memory(size_t need)
   ; /* nothing */
  }
  
 +static void mmap_limit_check(size_t length)
 +{
 + static int limit = -1;

Perhaps you want ssize_t here?  I see mmap() as a tool to handle a
lot more data than a single malloc() typically would ;-) so previous
mistakes by other people would not be a good excuse.

 + if (limit == -1) {
 + const char *env = getenv(GIT_MMAP_LIMIT);
 + limit = env ? atoi(env) * 1024 : 0;
 + }
 + if (limit && length > limit)
 + 	die("attempting to mmap %"PRIuMAX" over limit %d",
 + 		(intmax_t)length, limit);
 +}
 +
  void *xmmap(void *start, size_t length,
   int prot, int flags, int fd, off_t offset)
  {
 - void *ret = mmap(start, length, prot, flags, fd, offset);
 + void *ret;
 +
 + mmap_limit_check(length);
 + ret = mmap(start, length, prot, flags, fd, offset);
   if (ret == MAP_FAILED) {
   if (!length)
   return NULL;


Re: [PATCH] teach fast-export an --anonymize option

2014-08-21 Thread Jeff King
On Thu, Aug 21, 2014 at 01:15:10PM -0700, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  +/*
  + * We anonymize each component of a path individually,
  + * so that paths a/b and a/c will share a common root.
  + * The paths are cached via anonymize_mem so that repeated
  + * lookups for a will yield the same value.
  + */
  +static void anonymize_path(struct strbuf *out, const char *path,
  +  struct hashmap *map,
  +  char *(*generate)(const char *, size_t *))
  +{
  +   while (*path) {
  +   const char *end_of_component = strchrnul(path, '/');
  +   size_t len = end_of_component - path;
  +   const char *c = anonymize_mem(map, generate, path, &len);
  +   strbuf_add(out, c, len);
  +   path = end_of_component;
  +   if (*path)
  +   strbuf_addch(out, *path++);
  +   }
  +}
 
 Do two paths sort the same way before and after anonymisation?  For
 example, if generate() works as a simple substitution, it should map
 a character that sorts before (or after) '/' with another that also
 sorts before (or after) '/' for us to be able to diagnose an error
 that comes from D/F sort order confusion.

No, the sort order is totally lost. I'd be afraid that a general scheme
would end up leaking information about what was in the filenames. It
might be acceptable to leak some information here, though, if it adds to
the realism of the result.

I tried here to lay the basic infrastructure and do the simplest thing
that might work, so we could evaluate proposals like that independently
(and also because I didn't come up with a clever enough algorithm to do
what you're asking).  Patches welcome on top. :)

-Peff


Re: [PATCH] teach fast-export an --anonymize option

2014-08-21 Thread Jeff King
On Thu, Aug 21, 2014 at 02:57:22PM -0700, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  +--anonymize::
  +   Replace all paths, blob contents, commit and tag messages,
  +   names, and email addresses in the output with anonymized data,
  +   while still retaining the shape of history and of the stored
  +   tree.
 
 Sometimes branch names can contain codenames the project may prefer
 to hide from the general public, so they may need to be anonymised
 as well.

Yes, I do anonymize them (and check it in the tests). See
anonymize_refname. I just forgot to include it in the list. Trivial
squashable patch is below.

The few things I don't anonymize are:

  1. ref prefixes. We see the same distribution of refs/heads vs
 refs/tags, etc.

  2. refs/heads/master is left untouched, for convenience (and because
 it's not really a secret). The implementation is lazy, though, and
 would leave refs/heads/master-supersecret, as well. I can tighten
 that if we really want to be careful.

  3. gitlinks are left untouched, since sha1s cannot be reversed. This
 could leak some information (if your private repo points to a
 public one, I can find out you have it as a submodule). I doubt it
 matters, but we can also scramble the sha1s.

---
diff --git a/Documentation/git-fast-export.txt b/Documentation/git-fast-export.txt
index 0ec7cad..52831fa 100644
--- a/Documentation/git-fast-export.txt
+++ b/Documentation/git-fast-export.txt
@@ -106,10 +106,10 @@ marks the same across runs.
different from the commit's first parent).
 
 --anonymize::
-   Replace all paths, blob contents, commit and tag messages,
-   names, and email addresses in the output with anonymized data,
-   while still retaining the shape of history and of the stored
-   tree.
+   Replace all refnames, paths, blob contents, commit and tag
+   messages, names, and email addresses in the output with
+   anonymized data, while still retaining the shape of history and
+   of the stored tree.
 
 --refspec::
Apply the specified refspec to each ref exported. Multiple of them can


[PATCH v2] teach fast-export an --anonymize option

2014-08-21 Thread Jeff King
On Thu, Aug 21, 2014 at 06:49:10PM -0400, Jeff King wrote:

 The few things I don't anonymize are:
 
   1. ref prefixes. We see the same distribution of refs/heads vs
  refs/tags, etc.
 
   2. refs/heads/master is left untouched, for convenience (and because
  it's not really a secret). The implementation is lazy, though, and
  would leave refs/heads/master-supersecret, as well. I can tighten
  that if we really want to be careful.
 
   3. gitlinks are left untouched, since sha1s cannot be reversed. This
  could leak some information (if your private repo points to a
  public one, I can find out you have it as a submodule). I doubt it
  matters, but we can also scramble the sha1s.

Here's a re-roll that addresses the latter two. I don't think any are a
big deal, but it's much easier to say it's handled than try to figure
out whether and when it's important.

This also includes the documentation update I sent earlier. The
interdiff is a bit noisy, as I also converted the anonymize_mem function
to take void pointers (since it doesn't know or care what it's storing,
and this makes storing unsigned chars for sha1s easier).

-- >8 --
Subject: teach fast-export an --anonymize option

Sometimes users want to report a bug they experience on
their repository, but they are not at liberty to share the
contents of the repository. It would be useful if they could
produce a repository that has a similar shape to its history
and tree, but without leaking any information. This
anonymized repository could then be shared with developers
(assuming it still replicates the original problem).

This patch implements an --anonymize option to
fast-export, which generates a stream that can recreate such
a repository. Producing a single stream makes it easy for
the caller to verify that they are not leaking any useful
information. You can get an overview of what will be shared
by running a command like:

  git fast-export --anonymize --all |
  perl -pe 's/\d+/X/g' |
  sort -u |
  less

which will show every unique line we generate, modulo any
numbers (each anonymized token is assigned a number, like
User 0, and we replace it consistently in the output).

In addition to anonymizing, this produces test cases that
are relatively small (compared to the original repository)
and fast to generate (compared to using filter-branch, or
modifying the output of fast-export yourself). Here are
numbers for git.git:

  $ time git fast-export --anonymize --all \
 --tag-of-filtered-object=drop >output
  real0m2.883s
  user0m2.828s
  sys 0m0.052s

  $ gzip output
  $ ls -lh output.gz | awk '{print $5}'
  2.9M

Signed-off-by: Jeff King p...@peff.net
---
 Documentation/git-fast-export.txt |   6 +
 builtin/fast-export.c | 300 --
 t/t9351-fast-export-anonymize.sh  | 117 +++
 3 files changed, 412 insertions(+), 11 deletions(-)
 create mode 100755 t/t9351-fast-export-anonymize.sh

diff --git a/Documentation/git-fast-export.txt b/Documentation/git-fast-export.txt
index 221506b..52831fa 100644
--- a/Documentation/git-fast-export.txt
+++ b/Documentation/git-fast-export.txt
@@ -105,6 +105,12 @@ marks the same across runs.
in the commit (as opposed to just listing the files which are
different from the commit's first parent).
 
+--anonymize::
+   Replace all refnames, paths, blob contents, commit and tag
+   messages, names, and email addresses in the output with
+   anonymized data, while still retaining the shape of history and
+   of the stored tree.
+
 --refspec::
Apply the specified refspec to each ref exported. Multiple of them can
be specified.
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 92b4624..b8182c2 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -18,6 +18,7 @@
 #include parse-options.h
 #include quote.h
 #include remote.h
+#include blob.h
 
 static const char *fast_export_usage[] = {
N_("git fast-export [rev-list-opts]"),
@@ -34,6 +35,7 @@ static int full_tree;
 static struct string_list extra_refs = STRING_LIST_INIT_NODUP;
 static struct refspec *refspecs;
 static int refspecs_nr;
+static int anonymize;
 
 static int parse_opt_signed_tag_mode(const struct option *opt,
 const char *arg, int unset)
@@ -81,6 +83,76 @@ static int has_unshown_parent(struct commit *commit)
return 0;
 }
 
+struct anonymized_entry {
+   struct hashmap_entry hash;
+   const char *orig;
+   size_t orig_len;
+   const char *anon;
+   size_t anon_len;
+};
+
+static int anonymized_entry_cmp(const void *va, const void *vb,
+   const void *data)
+{
+   const struct anonymized_entry *a = va, *b = vb;
+   return a->orig_len != b->orig_len ||
+   memcmp(a->orig, b->orig, a->orig_len);
+}
+
+/*
+ * Basically keep a cache of X->Y so that we can repeatedly replace
+ * the same anonymized 

Re: [PATCH 18/18] signed push: final protocol update

2014-08-21 Thread Junio C Hamano
Shawn Pearce spea...@spearce.org writes:

 On Tue, Aug 19, 2014 at 3:06 PM, Junio C Hamano gits...@pobox.com wrote:

 +  push-cert = PKT-LINE("push-cert" NUL capability-list LF)

 Haha. NUL.  I love our wire protocol.

 + PKT-LINE("certificate version 0.1" LF)
 + PKT-LINE("pusher" ident LF)
 + PKT-LINE(LF)
 + *PKT-LINE(command LF)
 + *PKT-LINE(GPG signature lines LF)

 Should we include the URL as part of this certificate?

 Perhaps the pusher means to sign the master branch of experimental
 tree, but not their trunk tree?

Yes, in $gmane/255582 I cover this and also mention that we would
need some nonce from the receiving end to make it harder to
replay.

Currently I am leaning toward adding both a pushed-to URL and also a
"nonce" header, the latter of which the receiver can ask for with
push-cert=<nonce> in its initial capability advertisement.

There are a few gotchas I can certainly use help on, especially from
a smart-http expert ;-).

 * pushed-to URL will identify the site and the repository, so
   you cannot MITM my push to an experimental server and replay it
   against the authoritative server.

   However, the receiving end may not even know what name its users
   call the repository being pushed into.  Obviously gethostname()
   may not be what the pusher called us, and getcwd() may not match
   the repository name without leading /var/repos/shard3/ path
   components stripped, for example.

   I am not sure if we even have the necessary information at
   send-pack.c::send_pack() level, where it already has an
   established connection to the server (hence it does not need to
   know to whom it is talking to).


 * The receiving end will issue push-cert=<nonce> in its initial
   capability advertisement, and this nonce will be given in the
   PUSH_CERT_NONCE environment variable to the pre/post-receive
   hooks, to allow the "nonce" header in the signed certificate to be
   checked against it.  You cannot capture an earlier push of mine to
   the authoritative server and replay it later.

   That would all work well within a single receive-pack process,
   but with stateless RPC, it is unclear to me how we should
   arrange the nonce the initial instance of receive-pack placed
   on its capability advertisement to be securely passed to the
   instance of receive-pack that actually receives the push
   certificate.


Re: [PATCH 16/18] receive-pack: GPG-validate push certificates

2014-08-21 Thread David Turner
On Wed, 2014-08-20 at 12:38 -0700, Junio C Hamano wrote:
 David Turner dtur...@twopensource.com writes:
 
  On Wed, 2014-08-20 at 10:29 -0700, Junio C Hamano wrote:
  On Wed, Aug 20, 2014 at 9:56 AM, David Turner dtur...@twopensource.com 
  wrote:
   On Tue, 2014-08-19 at 15:06 -0700, Junio C Hamano wrote:
   Reusing the GPG signature check helpers we already have, verify
   the signature in receive-pack and give the results to the hooks
   via GIT_PUSH_CERT_{SIGNER,KEY,STATUS} environment variables.
  
   Policy decisions, such as accepting or rejecting a good signature by
   a key that is not fully trusted, is left to the hook and kept
   outside of the core.
  
   If I understand correctly, the hook does not have enough information to
   make this decision, because it is missing the date from the signature.
  
  The full certificate is available to the hook so anything we can do the 
  hook
  has enough information to do ;-)  But of course we should try to make it
  easier for the hook to validate the request.
 
  Excellent, then motivated hooks can do the right thing.
 
   This might allow an old signed push to be replayed, moving the head of a
   branch to an older state (say, one lacking the latest security updates).
  
  ... with old-sha1 recorded in the certificate?
 
  That does prevent most replays, but it does not prevent resurrection of
  a deleted branch by a replay of its initial creation (nor an undo of a
  force-push to rollback).  So I think we still need timestamps, but
  parsing them out of the cert is not terrible.
 
 As I already mentioned elsewhere, a more problematic thing about the
 push certificate as presented in 15/18 is that it does not say
 anything about where the push is going.  If you can capture a trial
 push to some random test repository I do with my signed push
 certificate, you could replay it to my public repository hosted at
 a more official site (say, k.org in the far distant future where it
 does not rely on ssh authentication to protect their services but
 uses the GPG signature on the push certificate to make sure it is I
 who is pushing).
 
 We can add a new "pushed-to repository URL" header line to the
 certificate, next to "pushed-by ident time", and have the
 receiving end verify that it matches to prevent such a replay.  I
 wonder if we can further extend it to avoid replays to the same
 repository.

I think but am not certain that "pushed-to repository URL", along with
"pushed-by ident time", means that the nonce is not needed. The
nonce might make replays harder, but pushed-to/pushed-by makes replays
useless since the receiving server can determine that the user intended
to take this action at this time on this server. 



Re: [PATCH 16/18] receive-pack: GPG-validate push certificates

2014-08-21 Thread Junio C Hamano
If you ignore the clock skew between the pusher and the receiver, then
you are correct, but otherwise not quite.  Also, by specifying that as
"nonce", not "server-timestamp", the receiving end has a choice in how
to generate and use the nonce value. The only requirement on the
protocol is that the pusher must parrot it literally.

On Thu, Aug 21, 2014 at 4:59 PM, David Turner dtur...@twopensource.com wrote:
 On Wed, 2014-08-20 at 12:38 -0700, Junio C Hamano wrote:
 David Turner dtur...@twopensource.com writes:

  On Wed, 2014-08-20 at 10:29 -0700, Junio C Hamano wrote:
  On Wed, Aug 20, 2014 at 9:56 AM, David Turner dtur...@twopensource.com 
  wrote:
   On Tue, 2014-08-19 at 15:06 -0700, Junio C Hamano wrote:
   Reusing the GPG signature check helpers we already have, verify
   the signature in receive-pack and give the results to the hooks
   via GIT_PUSH_CERT_{SIGNER,KEY,STATUS} environment variables.
  
   Policy decisions, such as accepting or rejecting a good signature by
   a key that is not fully trusted, is left to the hook and kept
   outside of the core.
  
   If I understand correctly, the hook does not have enough information to
   make this decision, because it is missing the date from the signature.
 
  The full certificate is available to the hook so anything we can do the 
  hook
  has enough information to do ;-)  But of course we should try to make it
  easier for the hook to validate the request.
 
  Excellent, then motivated hooks can do the right thing.
 
   This might allow an old signed push to be replayed, moving the head of a
   branch to an older state (say, one lacking the latest security updates).
 
  ... with old-sha1 recorded in the certificate?
 
  That does prevent most replays, but it does not prevent resurrection of
  a deleted branch by a replay of its initial creation (nor an undo of a
  force-push to rollback).  So I think we still need timestamps, but
  parsing them out of the cert is not terrible.

 As I already mentioned elsewhere, a more problematic thing about the
 push certificate as presented in 15/18 is that it does not say
 anything about where the push is going.  If you can capture a trial
 push to some random test repository I do with my signed push
 certificate, you could replay it to my public repository hosted at
 a more official site (say, k.org in the far distant future where it
 does not rely on ssh authentication to protect their services but
 uses the GPG signature on the push certificate to make sure it is I
 who is pushing).

 We can add a new pushed-to repository URL header line to the
 certificate, next to pushed-by ident time, and have the
 receiving end verify that it matches to prevent such a replay.  I
 wonder if we can further extend it to avoid replays to the same
 repository.

 I think but am not certain that pushed-to repository URL, along with
 the pushed-by ident time means that the nonce is not needed. The
 nonce might make replays harder, but pushed-to/pushed-by makes replays
 useless since the receiving server can determine that the user intended
 to take this action at this time on this server.



Re: [PATCH 18/18] signed push: final protocol update

2014-08-21 Thread David Turner
On Tue, 2014-08-19 at 15:06 -0700, Junio C Hamano wrote:
  
 +If the receiving end does not support push-cert, the sending end MUST
 +NOT send a push-cert command.
 +
 +When a push-cert command is sent, command-list MUST NOT be sent; the
 +commands recorded in the push certificate is used instead.  The GPG
 +signature lines are detached signature for the contents recorded in

are a detached signature

 +the push certificate before the signature block begins and is used

which is used (or and are used)

 +to certify that the commands were given by the pusher which must be

, who must be

 +the signer.
 +
 +
 +The receive-pack server that advertises this capability is willing
 +to accept a signed push certificate.  A send-pack client MUST NOT
 +send push-cert packet unless the receive-pack server advertises this

packets (or a push-cert packet)

  
 +static void queue_commands_from_cert(struct command **p,

Uninformative parameter name p.



Re: [PATCH 18/18] signed push: final protocol update

2014-08-21 Thread Kyle J. McKay

On Aug 21, 2014, at 16:40, Junio C Hamano wrote:


* The receiving end will issue push-cert=<nonce> in its initial
  capability advertisement, and this nonce will be given in the
  PUSH_CERT_NONCE environment variable to the pre/post-receive hooks,
  to allow the "nonce" header in the signed certificate to be
  checked against it.  You cannot capture an earlier push of mine to
  the authoritative server and replay it later.

  That would all work well within a single receive-pack process,
  but with stateless RPC, it is unclear to me how we should
  arrange the nonce the initial instance of receive-pack placed
  on its capability advertisement to be securely passed to the
  instance of receive-pack that actually receives the push
  certificate.


Have you considered having the advertised nonce only be updated after  
receipt of a successful signed push?


It would eliminate the stateless issue.  And since the next nonce to  
be advertised would be updated at the successful completion of a  
receive of a signed push no replay would be possible.  (I'm assuming  
that receive hook activity is already pipelined in the case of  
simultaneous pushes via some lock file or something or this scheme  
falls apart.)


The obvious downside is that only one of two or more simultaneous  
signed pushers could succeed.  But the sender could be modified to  
automatically retry (a limited number of times) on a nonce mismatch  
error.


A receive hook could also be responsible for generating the next nonce  
value using this technique.



Wishlist: git fetch --reference

2014-08-21 Thread Howard Chu
I maintain multiple copies of the same repo because I keep each one checked 
out to different branch/rev levels. It would be nice if, similar to clone 
--reference, we could also use git fetch --reference to reference a local repo 
when doing a fetch to pull in updates.


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Re: [PATCH] walker: avoid quadratic list insertion in mark_complete

2014-08-21 Thread Jeff King
On Thu, Aug 21, 2014 at 08:30:24PM +0200, René Scharfe wrote:

 Similar to 16445242 (fetch-pack: avoid quadratic list insertion in
 mark_complete), sort only after all refs are collected instead of while
 inserting.  The result is the same, but it's more efficient that way.
 The difference will only be measurable in repositories with a large
 number of refs.

Thanks, this looks obviously correct.

I wonder if we should do this on top:

diff --git a/walker.c b/walker.c
index 0148264..70088b8 100644
--- a/walker.c
+++ b/walker.c
@@ -203,7 +203,7 @@ static int interpret_target(struct walker *walker, char *target, unsigned char *
 static int mark_complete(const char *path, const unsigned char *sha1, int flag, void *cb_data)
 {
 	struct commit *commit = lookup_commit_reference_gently(sha1, 1);
-	if (commit) {
+	if (commit && !(commit->object.flags & COMPLETE)) {
 		commit->object.flags |= COMPLETE;
 		commit_list_insert_by_date(commit, &complete);
 	}

It's not as big a deal with your patch since you've made it O(n log n),
but reducing n further does not hurt.

-Peff


Re: [PATCH] sha1_name: avoid quadratic list insertion in handle_one_ref

2014-08-21 Thread Jeff King
On Thu, Aug 21, 2014 at 08:30:29PM +0200, René Scharfe wrote:

 Similar to 16445242 (fetch-pack: avoid quadratic list insertion in
 mark_complete), sort only after all refs are collected instead of while
 inserting.  The result is the same, but it's more efficient that way.
 The difference will only be measurable in repositories with a large
 number of refs.

Looks good, thanks.

I was hoping one of these would be fixing the quadratic http-push
behavior I mentioned yesterday, but alas. We seem to have a lot of
quadratic spots to fix. :)

-Peff


Re: [PATCH 1/2] Makefile: use find to determine static header dependencies

2014-08-21 Thread Jeff King
On Thu, Aug 21, 2014 at 07:48:18AM -0700, Jonathan Nieder wrote:

 Subject: i18n: treat "make pot" as an explicitly-invoked target
 
 po/git.pot is normally used as-is and not regenerated by people
 building git, so it is okay if an explicit make po/git.pot always
 automatically regenerates it.  Depend on the magic FORCE target
 instead of explicitly keeping track of dependencies.
 
 This simplifies the makefile, in particular preparing for a moment
 when $(LIB_H), which is part of $(LOCALIZED_C), can be computed on the
 fly.
 
 We still need a dependency on GENERATED_H, to force those files to be
 built when regenerating git.pot.
 
 Signed-off-by: Jonathan Nieder jrnie...@gmail.com

Yeah, this is way less gross than what I proposed, and I do not think it
hurts anything. We do still need to drop the use of := in assigning
LOCALIZED_C, but I do not think there is any need for it in the first
place.

-Peff


Re: [PATCH 18/18] signed push: final protocol update

2014-08-21 Thread Junio C Hamano
On Thu, Aug 21, 2014 at 12:28 PM, Shawn Pearce spea...@spearce.org wrote:
 On Tue, Aug 19, 2014 at 3:06 PM, Junio C Hamano gits...@pobox.com wrote:

 +  push-cert = PKT-LINE("push-cert" NUL capability-list LF)

 Haha. NUL.  I love our wire protocol.

It is a direct and natural consequence of [PATCH 02/18].

We could use SP here, if we really wanted to, but that would make the
push-cert packet a special kind that is different from others, which we
would want to avoid. shallow is already special in that it cannot even
carry the feature request, and it is not worth introducing and advertising
a new capability to fix it, but at least we can avoid making the same
mistake here.


[PATCH v2 0/3] dropping manually-maintained LIB_H

2014-08-21 Thread Jeff King
On Fri, Aug 22, 2014 at 12:12:36AM -0400, Jeff King wrote:

  po/git.pot is normally used as-is and not regenerated by people
  building git, so it is okay if an explicit make po/git.pot always
  automatically regenerates it.  Depend on the magic FORCE target
  instead of explicitly keeping track of dependencies.
 
 Yeah, this is way less gross than what I proposed, and I do not think it
 hurts anything. We do still need to drop the use of := in assigning
 LOCALIZED_C, but I do not think there is any need for it in the first
 place.

Here's a re-roll of my series on top of your patch. In addition to
rebasing, I also switched it to use $(FIND) in the shell snippet rather
than a bare find.

I notice that for the ctags generation we actually try git ls-tree
first and then fall back to find. I guess we could do that here, but I
do not think the speed improvement matters much. And I think the find
output is a little more conservative. If you are adding a new header
file but have not mentioned it to git yet, I think we would prefer to
err on the side of including it as a potential dependency.

  [1/3]: i18n: treat "make pot" as an explicitly-invoked target
  [2/3]: Makefile: use `find` to determine static header dependencies
  [3/3]: Makefile: drop CHECK_HEADER_DEPENDENCIES code

-Peff


[PATCH 1/3] i18n: treat "make pot" as an explicitly-invoked target

2014-08-21 Thread Jeff King
From: Jonathan Nieder jrnie...@gmail.com

po/git.pot is normally used as-is and not regenerated by people
building git, so it is okay if an explicit make po/git.pot always
automatically regenerates it.  Depend on the magic FORCE target
instead of explicitly keeping track of dependencies.

This simplifies the makefile, in particular preparing for a moment
when $(LIB_H), which is part of $(LOCALIZED_C), can be computed on the
fly. It also fixes a slight breakage in which changes to perl and shell
scripts did not trigger a rebuild of po/git.pot.

We still need a dependency on GENERATED_H, to force those files to be
built when regenerating git.pot.

Signed-off-by: Jonathan Nieder jrnie...@gmail.com
Signed-off-by: Jeff King p...@peff.net
---
Mostly as you sent it, but I mentioned the missing script dependencies
in the commit message, too.

 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 2320de5..cf0ccdf 100644
--- a/Makefile
+++ b/Makefile
@@ -2138,7 +2138,7 @@ LOCALIZED_SH += t/t0200/test.sh
 LOCALIZED_PERL += t/t0200/test.perl
 endif
 
-po/git.pot: $(LOCALIZED_C)
+po/git.pot: $(GENERATED_H) FORCE
$(QUIET_XGETTEXT)$(XGETTEXT) -o$@+ $(XGETTEXT_FLAGS_C) $(LOCALIZED_C)
	$(QUIET_XGETTEXT)$(XGETTEXT) -o$@+ --join-existing $(XGETTEXT_FLAGS_SH) \
		$(LOCALIZED_SH)
-- 
2.1.0.346.ga0367b9



[PATCH 2/3] Makefile: use `find` to determine static header dependencies

2014-08-21 Thread Jeff King
Most modern platforms will use automatically computed header
dependencies to figure out when a C file needs to be rebuilt
due to a header changing. With old compilers, however, we
fall back to a static list of header files. If any of them
changes, we recompile everything. This is overly
conservative, but the best we can do on older platforms.

It is unfortunately easy for our static header list to grow
stale, as none of the regular developers make use of it.
Instead of trying to keep it up to date, let's invoke find
to generate the list dynamically.

Since we do not use the value $(LIB_H) unless either
COMPUTE_HEADER_DEPENDENCIES is turned on or the user is
building po/git.pot (where it comes in via $(LOCALIZED_C),
make is smart enough to not even run this find in most
cases. However, we do need to stop using the immediate
variable assignment := for $(LOCALIZED_C). That's OK,
because it was not otherwise useful here.

Signed-off-by: Jeff King p...@peff.net
---
I cannot see any reason for the :=, but maybe I am missing something.

 Makefile | 140 ---
 1 file changed, 8 insertions(+), 132 deletions(-)

diff --git a/Makefile b/Makefile
index cf0ccdf..f2b85c9 100644
--- a/Makefile
+++ b/Makefile
@@ -432,7 +432,6 @@ XDIFF_OBJS =
 VCSSVN_OBJS =
 GENERATED_H =
 EXTRA_CPPFLAGS =
-LIB_H =
 LIB_OBJS =
 PROGRAM_OBJS =
 PROGRAMS =
@@ -631,131 +630,11 @@ VCSSVN_LIB = vcs-svn/lib.a
 
 GENERATED_H += common-cmds.h
 
-LIB_H += advice.h
-LIB_H += archive.h
-LIB_H += argv-array.h
-LIB_H += attr.h
-LIB_H += bisect.h
-LIB_H += blob.h
-LIB_H += branch.h
-LIB_H += builtin.h
-LIB_H += bulk-checkin.h
-LIB_H += bundle.h
-LIB_H += cache-tree.h
-LIB_H += cache.h
-LIB_H += color.h
-LIB_H += column.h
-LIB_H += commit.h
-LIB_H += compat/bswap.h
-LIB_H += compat/mingw.h
-LIB_H += compat/obstack.h
-LIB_H += compat/poll/poll.h
-LIB_H += compat/precompose_utf8.h
-LIB_H += compat/terminal.h
-LIB_H += compat/win32/dirent.h
-LIB_H += compat/win32/pthread.h
-LIB_H += compat/win32/syslog.h
-LIB_H += connected.h
-LIB_H += convert.h
-LIB_H += credential.h
-LIB_H += csum-file.h
-LIB_H += decorate.h
-LIB_H += delta.h
-LIB_H += diff.h
-LIB_H += diffcore.h
-LIB_H += dir.h
-LIB_H += exec_cmd.h
-LIB_H += ewah/ewok.h
-LIB_H += ewah/ewok_rlw.h
-LIB_H += fetch-pack.h
-LIB_H += fmt-merge-msg.h
-LIB_H += fsck.h
-LIB_H += gettext.h
-LIB_H += git-compat-util.h
-LIB_H += gpg-interface.h
-LIB_H += graph.h
-LIB_H += grep.h
-LIB_H += hashmap.h
-LIB_H += help.h
-LIB_H += http.h
-LIB_H += kwset.h
-LIB_H += levenshtein.h
-LIB_H += line-log.h
-LIB_H += line-range.h
-LIB_H += list-objects.h
-LIB_H += ll-merge.h
-LIB_H += log-tree.h
-LIB_H += mailmap.h
-LIB_H += merge-blobs.h
-LIB_H += merge-recursive.h
-LIB_H += mergesort.h
-LIB_H += notes-cache.h
-LIB_H += notes-merge.h
-LIB_H += notes-utils.h
-LIB_H += notes.h
-LIB_H += object.h
-LIB_H += pack-objects.h
-LIB_H += pack-revindex.h
-LIB_H += pack.h
-LIB_H += pack-bitmap.h
-LIB_H += parse-options.h
-LIB_H += patch-ids.h
-LIB_H += pathspec.h
-LIB_H += pkt-line.h
-LIB_H += prio-queue.h
-LIB_H += progress.h
-LIB_H += prompt.h
-LIB_H += quote.h
-LIB_H += reachable.h
-LIB_H += reflog-walk.h
-LIB_H += refs.h
-LIB_H += remote.h
-LIB_H += rerere.h
-LIB_H += resolve-undo.h
-LIB_H += revision.h
-LIB_H += run-command.h
-LIB_H += send-pack.h
-LIB_H += sequencer.h
-LIB_H += sha1-array.h
-LIB_H += sha1-lookup.h
-LIB_H += shortlog.h
-LIB_H += sideband.h
-LIB_H += sigchain.h
-LIB_H += strbuf.h
-LIB_H += streaming.h
-LIB_H += string-list.h
-LIB_H += submodule.h
-LIB_H += tag.h
-LIB_H += tar.h
-LIB_H += thread-utils.h
-LIB_H += transport.h
-LIB_H += tree-walk.h
-LIB_H += tree.h
-LIB_H += unpack-trees.h
-LIB_H += unicode_width.h
-LIB_H += url.h
-LIB_H += urlmatch.h
-LIB_H += userdiff.h
-LIB_H += utf8.h
-LIB_H += varint.h
-LIB_H += vcs-svn/fast_export.h
-LIB_H += vcs-svn/line_buffer.h
-LIB_H += vcs-svn/repo_tree.h
-LIB_H += vcs-svn/sliding_window.h
-LIB_H += vcs-svn/svndiff.h
-LIB_H += vcs-svn/svndump.h
-LIB_H += walker.h
-LIB_H += wildmatch.h
-LIB_H += wt-status.h
-LIB_H += xdiff-interface.h
-LIB_H += xdiff/xdiff.h
-LIB_H += xdiff/xdiffi.h
-LIB_H += xdiff/xemit.h
-LIB_H += xdiff/xinclude.h
-LIB_H += xdiff/xmacros.h
-LIB_H += xdiff/xprepare.h
-LIB_H += xdiff/xtypes.h
-LIB_H += xdiff/xutils.h
+LIB_H = $(shell $(FIND) . \
+   -name .git -prune -o \
+   -name t -prune -o \
+   -name Documentation -prune -o \
+   -name '*.h' -print)
 
 LIB_OBJS += abspath.o
 LIB_OBJS += advice.o
@@ -1381,7 +1260,6 @@ ifdef NO_INET_PTON
 endif
 ifndef NO_UNIX_SOCKETS
LIB_OBJS += unix-socket.o
-   LIB_H += unix-socket.h
PROGRAM_OBJS += credential-cache.o
PROGRAM_OBJS += credential-cache--daemon.o
 endif
@@ -1405,12 +1283,10 @@ endif
 ifdef BLK_SHA1
SHA1_HEADER = block-sha1/sha1.h
LIB_OBJS += block-sha1/sha1.o
-   LIB_H += block-sha1/sha1.h
 else
 ifdef PPC_SHA1
SHA1_HEADER = ppc/sha1.h
LIB_OBJS += ppc/sha1.o ppc/sha1ppc.o
-   

[PATCH 3/3] Makefile: drop CHECK_HEADER_DEPENDENCIES code

2014-08-21 Thread Jeff King
This code was useful when we kept a static list of header
files, and it was easy to forget to update it. Since the last
commit, we generate the list dynamically.

Technically this could still be used to find a dependency
that our dynamic check misses (e.g., a header file without a
.h extension). But such a file is reasonably unlikely to be
added, and even less likely to be noticed by this tool
(because it has to be run manually). It is not worth
carrying around the cruft in the Makefile.

Signed-off-by: Jeff King p...@peff.net
---
Same as before.

 Makefile | 59 ---
 1 file changed, 59 deletions(-)

diff --git a/Makefile b/Makefile
index f2b85c9..23e621f 100644
--- a/Makefile
+++ b/Makefile
@@ -317,9 +317,6 @@ all::
 # dependency rules.  The default is auto, which means to use computed header
 # dependencies if your compiler is detected to support it.
 #
-# Define CHECK_HEADER_DEPENDENCIES to check for problems in the hard-coded
-# dependency rules.
-#
 # Define NATIVE_CRLF if your platform uses CRLF for line endings.
 #
 # Define XDL_FAST_HASH to use an alternative line-hashing method in
@@ -904,11 +901,6 @@ sysconfdir = etc
 endif
 endif
 
-ifdef CHECK_HEADER_DEPENDENCIES
-COMPUTE_HEADER_DEPENDENCIES = no
-USE_COMPUTED_HEADER_DEPENDENCIES =
-endif
-
 ifndef COMPUTE_HEADER_DEPENDENCIES
 COMPUTE_HEADER_DEPENDENCIES = auto
 endif
@@ -1809,29 +1801,13 @@ $(dep_dirs):
 missing_dep_dirs := $(filter-out $(wildcard $(dep_dirs)),$(dep_dirs))
 dep_file = $(dir $@).depend/$(notdir $@).d
 dep_args = -MF $(dep_file) -MQ $@ -MMD -MP
-ifdef CHECK_HEADER_DEPENDENCIES
-$(error cannot compute header dependencies outside a normal build. \
-Please unset CHECK_HEADER_DEPENDENCIES and try again)
-endif
 endif
 
 ifneq ($(COMPUTE_HEADER_DEPENDENCIES),yes)
-ifndef CHECK_HEADER_DEPENDENCIES
 dep_dirs =
 missing_dep_dirs =
 dep_args =
 endif
-endif
-
-ifdef CHECK_HEADER_DEPENDENCIES
-ifndef PRINT_HEADER_DEPENDENCIES
-missing_deps = $(filter-out $(notdir $^), \
-   $(notdir $(shell $(MAKE) -s $@ \
-   CHECK_HEADER_DEPENDENCIES=YesPlease \
-   USE_COMPUTED_HEADER_DEPENDENCIES=YesPlease \
-   PRINT_HEADER_DEPENDENCIES=YesPlease)))
-endif
-endif
 
 ASM_SRC := $(wildcard $(OBJECTS:o=S))
 ASM_OBJ := $(ASM_SRC:S=o)
@@ -1839,45 +1815,10 @@ C_OBJ := $(filter-out $(ASM_OBJ),$(OBJECTS))
 
 .SUFFIXES:
 
-ifdef PRINT_HEADER_DEPENDENCIES
-$(C_OBJ): %.o: %.c FORCE
-   echo $^
-$(ASM_OBJ): %.o: %.S FORCE
-   echo $^
-
-ifndef CHECK_HEADER_DEPENDENCIES
-$(error cannot print header dependencies during a normal build. \
-Please set CHECK_HEADER_DEPENDENCIES and try again)
-endif
-endif
-
-ifndef PRINT_HEADER_DEPENDENCIES
-ifdef CHECK_HEADER_DEPENDENCIES
-$(C_OBJ): %.o: %.c $(dep_files) FORCE
-   @set -e; echo CHECK $@; \
-   missing_deps=$(missing_deps); \
-   if test $$missing_deps; \
-   then \
-   echo missing dependencies: $$missing_deps; \
-   false; \
-   fi
-$(ASM_OBJ): %.o: %.S $(dep_files) FORCE
-   @set -e; echo CHECK $@; \
-   missing_deps=$(missing_deps); \
-   if test $$missing_deps; \
-   then \
-   echo missing dependencies: $$missing_deps; \
-   false; \
-   fi
-endif
-endif
-
-ifndef CHECK_HEADER_DEPENDENCIES
 $(C_OBJ): %.o: %.c GIT-CFLAGS $(missing_dep_dirs)
	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
 $(ASM_OBJ): %.o: %.S GIT-CFLAGS $(missing_dep_dirs)
	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
-endif
 
 %.s: %.c GIT-CFLAGS FORCE
	$(QUIET_CC)$(CC) -o $@ -S $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
-- 
2.1.0.346.ga0367b9
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Wishlist: git fetch --reference

2014-08-21 Thread Howard Chu

Jeff King wrote:

On Thu, Aug 21, 2014 at 07:57:47PM -0700, Howard Chu wrote:


I maintain multiple copies of the same repo because I keep each one checked
out to different branch/rev levels. It would be nice if, similar to clone
--reference, we could also use git fetch --reference to reference a local
repo when doing a fetch to pull in updates.


I think it is just spelled:

   echo $reference_repo >>.git/objects/info/alternates
   git fetch

We need --reference with clone because that first line needs to happen
after clone runs git init but before it runs git fetch. And if you
cloned with --reference, of course, the alternates file remains and
further fetches will automatically use it.


Aha, thanks, hadn't realized that. Just checked and yes, the alternates file 
is already set in all of these different copies.
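The sequence Jeff describes can be exercised end to end; both repositories below are throwaway ones created just for the demonstration:

```shell
# End-to-end sketch of borrowing objects via the alternates file, as
# described above: point .git/objects/info/alternates at another local
# repository's object store, then fetch as usual. The client sees the
# reference repo's objects as already present and does not copy them.
set -e
tmp=$(mktemp -d)

# A local "reference" repository with one commit.
git init -q "$tmp/ref"
git -C "$tmp/ref" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m base

# A fresh repository that borrows ref's objects instead of copying them.
git init -q "$tmp/work"
echo "$tmp/ref/.git/objects" >>"$tmp/work/.git/objects/info/alternates"
git -C "$tmp/work" fetch -q "$tmp/ref" HEAD:refs/heads/imported
git -C "$tmp/work" rev-parse imported   # same commit id as in ref
```

Note the alternates entry points at the objects directory, not the repository root.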


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Re: [PATCH] unblock and unignore SIGPIPE

2014-08-21 Thread Patrick Reynolds
On Sun, Aug 17, 2014 at 8:14 PM, Eric Wong normalper...@yhbt.net wrote:
 But unicorn would ignore SIGPIPE if Ruby did not; relying on SIGPIPE
 while doing any multiplexed I/O doesn't work well.

Exactly.  Callers block SIGPIPE for their own legitimate reasons, but they
don't consistently unblock it before spawning a git subprocess that needs
the default SIGPIPE behavior.  Easier to fix it in git than in every potential
caller.
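The inheritance problem being described can be reproduced from a shell, since an ignored SIGPIPE disposition (like a blocked one) survives exec into children. Below, `trap '' PIPE` stands in for the caller that ignores the signal, and `yes | head` for a subprocess writing to a closed pipe; this illustrates the failure mode, not the patch itself, and the exact non-141 status under SIG_IGN depends on how the writer (GNU yes here) reports write errors:

```shell
# With the default disposition, the writer dies of SIGPIPE as soon as
# the reader exits: status 141 = 128 + SIGPIPE(13). With SIGPIPE
# ignored in the parent (and inherited by the child across exec), the
# writes instead fail with EPIPE and the writer exits some other way.
writer_status() {
    # Run a writer against a reader that quits after one line, and
    # report the writer's exit status on stderr.
    ( yes 2>/dev/null; echo "$?" >&2 ) | head -n 1 >/dev/null
}

dfl=$(writer_status 2>&1)                      # default SIGPIPE
ign=$( ( trap '' PIPE; writer_status ) 2>&1 )  # SIGPIPE ignored

echo "default: $dfl, ignored: $ign"
```

The second status differing from 141 is exactly the kind of behavior change a git subprocess sees when its caller forgets to restore SIGPIPE.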

--Patrick