git issue with builds on Travis-CI

2018-10-12 Thread Maurice McCabe
There is a problem on travis-ci with doing builds on Pull Requests
with multiple jobs. For each job it will build off the FETCH_HEAD. The
problem is that if the FETCH_HEAD changes while the build is running
(due to a commit), the subsequent jobs will build off the new
FETCH_HEAD. This results in loss of integrity of the build.

The fix would have been doing a fetch by the SHA-1 (commit) instead of
the FETCH_HEAD. But this results in an error:
"error: Server does not allow request for unadvertised object"

Since this is not working, the current option available to the people
at travis-ci is to abandon the build with an error message. This is
not a good user experience. The preferred option is to trigger a new
build when a commit is made.

You can see more details about the problem in this comment:
https://github.com/travis-ci/travis-ci/issues/8577#issuecomment-336596843
and the current discussion:
https://github.com/travis-ci/travis-ci/issues/10210

BTW: on reading the git mailing list a suggestion seems to be to ask
the server operator (GitHub in this case) to set:
"git config uploadpack.allowReachableSHA1InWant 1"
Apparently this has some security implications.

Does anyone have a suggestion how to do a fetch by SHA-1? Or is there
another approach to solve this issue on travis-ci?

Thnx,

Maurice


Re: [PATCH v2 1/1] zlib.c: use size_t for size

2018-10-12 Thread Torsten Bögershausen
[]
> Neither v1 nor v2 of this patch compiles on 32 bit Linux; see
> 
>   https://travis-ci.org/git/git/jobs/440514375#L628
> 
> The fixup patch below makes it compile on 32 bit and the test suite
> passes as well, but I didn't thoroughly review the changes; I only
> wanted to get 'pu' build again.
> 

Oh, yes, I didn't test under Linux 32 bit (and neither Windows)
I will try to compose a proper v3 the next days.

[snip the good stuff]


Re: [PATCH] zlib.c: use size_t for size

2018-10-12 Thread Jeff King
On Fri, Oct 12, 2018 at 10:38:45PM -0400, Jeff King wrote:

> So right now let's imagine that off_t is 64-bit, and "unsigned long" is
> 32-bit (e.g., 32-bit system, or an IL32P64 model like Windows). We'll
> repeatedly ask use_pack() for a window, and it will tell us how many
> bytes we have in "avail". So even as a 32-bit value, that just means
> we'll process chunks smaller than 4GB, and this is correct (or at least
> this part of it -- hold on). But we can still process the whole "len"
> given by the off_t eventually.

So this "hold on" was because I thought I had found another bug in
use_pack(), but I actually think it's OK.

In use_pack(), we do this:

  if (left)
  *left = win->len - xsize_t(offset);

where win->len is a size_t. Before this patch, "left" is a pointer to
unsigned long. So that has the usual broken-on-Windows mismatch. This
patch makes it a size_t, which is good.

But what's up with that xsize_t(offset)? We'll complain about truncation
_before_ we do any offset subtraction. So at first glance, I thought
this meant we were broken for larger-than-4GB packs on 32-bit systems
when trying to read past the 4GB mark.

But no, right before that we have this line:

  offset -= win->offset;

So offset is in fact no longer its original meaning of "offset into the
packfile", but is now an offset of the specific request into the window
we found.

So I think it's correct, but it sure confused me. I wonder if another
variable might help, like:

  size_t offset_in_window;
  ...

  /*
   * We know this difference will fit in a size_t, because our mmap
   * window by * definition can be no larger than a size_t.
   */
  offset_in_window = xsize_t(offset - win->offset);
  if (left)
*left = win->len - offset_in_window;
  return win->base + offset_in_window;

I dunno. Maybe it is overkill.

-Peff


Re: [PATCH] zlib.c: use size_t for size

2018-10-12 Thread Jeff King
On Fri, Oct 12, 2018 at 04:07:25PM +0900, Junio C Hamano wrote:

> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index e6316d294d..b9ca04eb8a 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -266,15 +266,15 @@ static void copy_pack_data(struct hashfile *f,
>   struct packed_git *p,
>   struct pack_window **w_curs,
>   off_t offset,
> - off_t len)
> + size_t len)
>  {
>   unsigned char *in;
> - unsigned long avail;
> + size_t avail;

I know there were a lot of comments about "maybe this off_t switch is
not good". Let me say something a bit stronger: I think this part of the
change is strictly worse.

copy_pack_data() looks like this right now:

  static void copy_pack_data(struct hashfile *f,
  struct packed_git *p,
  struct pack_window **w_curs,
  off_t offset,
  off_t len)
  {
  unsigned char *in;
  unsigned long avail;
  
  while (len) {
  in = use_pack(p, w_curs, offset, );
  if (avail > len)
  avail = (unsigned long)len;
  hashwrite(f, in, avail);
  offset += avail;
  len -= avail;
  }
  }

So right now let's imagine that off_t is 64-bit, and "unsigned long" is
32-bit (e.g., 32-bit system, or an IL32P64 model like Windows). We'll
repeatedly ask use_pack() for a window, and it will tell us how many
bytes we have in "avail". So even as a 32-bit value, that just means
we'll process chunks smaller than 4GB, and this is correct (or at least
this part of it -- hold on). But we can still process the whole "len"
given by the off_t eventually.

But by switching away from off_t in the function interface, we risk
truncation before we even enter the loop. Because of the switch to
size_t, it actually works on an IL32P64 system (because size_t is big
there), but it has introduced a bug on a true 32-bit system. If your
off_t really is 64-bit (and it generally is because we #define
_FILE_OFFSET_BITS), the function will truncate modulo 2^32.

And nor will most compilers warn without -Wconversion. You can try it
with this on Linux:

  #define _FILE_OFFSET_BITS 64
  #include 
  
  void foo(size_t x);
  void bar(off_t x);
  
  void bar(off_t x)
  {
  
foo(x);
  }

That compiles fine with "gcc -c -m32 -Wall -Werror -Wextra" for me.
Adding "-Wconversion" catches it, but our code base is not close to
compiling with that warning enabled.

So I don't think this hunk is actually fixing any problems, and is
actually introducing one.

I do in general support moving to size_t over "unsigned long". Switching
avail to size_t makes sense here. It's just the off_t part that is
funny.

-Peff


Re: What's cooking in git.git (Oct 2018, #02; Sat, 13)

2018-10-12 Thread Junio C Hamano
Stefan Beller  writes:

>> * sb/submodule-recursive-fetch-gets-the-tip (2018-10-11) 9 commits
>>  . builtin/fetch: check for submodule updates for non branch fetches
>>  . fetch: retry fetching submodules if needed objects were not fetched
>>  . submodule: fetch in submodules git directory instead of in worktree
>>  . repository: repo_submodule_init to take a submodule struct
>>  . submodule.c: do not copy around submodule list
>>  . submodule.c: move global changed_submodule_names into fetch submodule 
>> struct
>>  . submodule.c: sort changed_submodule_names before searching it
>>  . submodule.c: fix indentation
>>  . sha1-array: provide oid_array_filter
>>
>>  "git fetch --recurse-submodules" may not fetch the necessary commit
>>  that is bound to the superproject, which is getting corrected.
>>
>>  Ejected for now, as it has fallouts in places like t/helper/.
>
> This is the first time I hear about that, I'll look into that.
> The tipmost commit there is also shoddy, I'll redo that.

This is the first time I saw the breakage with this series, but I
would not be suprised, as this was rerolled recently.  Who knows
what got changed in this series and in other topics---any new
interaction can arise and that is a normal part of distributed
development.

The xx/sb-submodule-recursive-fetch-gets-the-tip-in-pu branch at
git://github.com/gitster/git.git has a merge of this into 'pu', with
textual conflicts all resolved.  

At least t/helper/test-submodule-nested-repo-config.c fails to
build; I didn't check if there are other breakages.



Re: [RFC PATCH 1/2] fuzz: Add basic fuzz testing target.

2018-10-12 Thread Josh Steadmon
On 2018.10.10 11:14, Junio C Hamano wrote:
> Josh Steadmon  writes:
> 
> > +FUZZ_OBJS += fuzz-pack-headers.o
> > +
> > +FUZZ_PROGRAMS += $(patsubst %.o,%,$(FUZZ_OBJS))
> > +
> > ...
> > +### Fuzz testing
> > +#
> > +.PHONY: fuzz-clean fuzz-objs fuzz-compile
> 
> I take it that you anticipate the fuzz programs in the future all
> be named fuzz-$(blah), whose source is fuzz-$(blah).o (even though
> we may grow some common code that may be linked with them, which can
> be done by tweaking the rule for the $(FUZZ_PROGRAMS) target).  Am I
> reading you correctly?  Would fuzz-{clean,objs,compile} risk squatting
> on nicer names we may want to use for $(blah) down the line?

Yes, that's correct. I've reworked the rules to be more compatible with
how OSS-Fuzz expects to build these targets, and now "fuzz-all" is
the only remaining special target.

> > + ...
> > +$(FUZZ_PROGRAMS): fuzz-compile
> > +   clang++ $(FUZZ_LDFLAGS) $(LIB_OBJS) $(BUILTIN_OBJS) $(XDIFF_OBJS) \
> > +   $(EXTLIBS) git.o $@.o /usr/lib/llvm-4.0/lib/libFuzzer.a -o $@
> 
> Is the expected usage pattern to know a single fuzz-* program the
> builder wants to build, to run "make fuzz-pack-headers"?  If not, it
> also would be a good idea to have something like
> 
> fuzz-build-all:: $(FUZZ_PROGRAMS)
> .PHONY: fuzz-build-all
> 
> perhaps?
> 
> Also, in the final version we unleash to general developer audience,
> we'd want to support "make V=1" (and "make" that is "$(QUIET)").

Done and done.


[PATCH v2 2/2] fuzz: Add fuzz testing for packfile indices.

2018-10-12 Thread steadmon
From: Josh Steadmon 

Breaks the majority of check_packed_git_idx() into a separate function,
load_idx(). The latter function operates on arbitrary buffers, which
makes it suitable as a fuzzing test target.

Signed-off-by: Josh Steadmon 
---
 .gitignore  |  1 +
 Makefile|  1 +
 fuzz-pack-idx.c | 13 +
 packfile.c  | 44 +---
 packfile.h  | 13 +
 5 files changed, 53 insertions(+), 19 deletions(-)
 create mode 100644 fuzz-pack-idx.c

diff --git a/.gitignore b/.gitignore
index 87a28b3115..64b3377d40 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,5 +1,6 @@
 /fuzz_corpora
 /fuzz-pack-headers
+/fuzz-pack-idx
 /GIT-BUILD-OPTIONS
 /GIT-CFLAGS
 /GIT-LDFLAGS
diff --git a/Makefile b/Makefile
index 7f5a19b8ab..1b3d893090 100644
--- a/Makefile
+++ b/Makefile
@@ -685,6 +685,7 @@ SCRIPTS = $(SCRIPT_SH_INS) \
 ETAGS_TARGET = TAGS
 
 FUZZ_OBJS += fuzz-pack-headers.o
+FUZZ_OBJS += fuzz-pack-idx.o
 
 # Always build fuzz objects even if not testing, to prevent bit-rot.
 all:: $(FUZZ_OBJS)
diff --git a/fuzz-pack-idx.c b/fuzz-pack-idx.c
new file mode 100644
index 00..0c3d777aac
--- /dev/null
+++ b/fuzz-pack-idx.c
@@ -0,0 +1,13 @@
+#include "object-store.h"
+#include "packfile.h"
+
+int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size);
+
+int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
+{
+   struct packed_git p;
+
+   load_idx("fuzz-input", GIT_SHA1_RAWSZ, (void *)data, size, );
+
+   return 0;
+}
diff --git a/packfile.c b/packfile.c
index 841b36182f..86074a76e9 100644
--- a/packfile.c
+++ b/packfile.c
@@ -80,10 +80,8 @@ void pack_report(void)
 static int check_packed_git_idx(const char *path, struct packed_git *p)
 {
void *idx_map;
-   struct pack_idx_header *hdr;
size_t idx_size;
-   uint32_t version, nr, i, *index;
-   int fd = git_open(path);
+   int fd = git_open(path), ret;
struct stat st;
const unsigned int hashsz = the_hash_algo->rawsz;
 
@@ -101,16 +99,32 @@ static int check_packed_git_idx(const char *path, struct 
packed_git *p)
idx_map = xmmap(NULL, idx_size, PROT_READ, MAP_PRIVATE, fd, 0);
close(fd);
 
-   hdr = idx_map;
+   ret = load_idx(path, hashsz, idx_map, idx_size, p);
+
+   if (ret)
+   munmap(idx_map, idx_size);
+
+   return ret;
+}
+
+int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
+size_t idx_size, struct packed_git *p)
+{
+   struct pack_idx_header *hdr = idx_map;
+   uint32_t version, nr, i, *index;
+
+   if (idx_size < 4 * 256 + hashsz + hashsz)
+   return error("index file %s is too small", path);
+   if (idx_map == NULL)
+   return error("empty data");
+
if (hdr->idx_signature == htonl(PACK_IDX_SIGNATURE)) {
version = ntohl(hdr->idx_version);
-   if (version < 2 || version > 2) {
-   munmap(idx_map, idx_size);
+   if (version < 2 || version > 2)
return error("index file %s is version %"PRIu32
 " and is not supported by this binary"
 " (try upgrading GIT to a newer version)",
 path, version);
-   }
} else
version = 1;
 
@@ -120,10 +134,8 @@ static int check_packed_git_idx(const char *path, struct 
packed_git *p)
index += 2;  /* skip index header */
for (i = 0; i < 256; i++) {
uint32_t n = ntohl(index[i]);
-   if (n < nr) {
-   munmap(idx_map, idx_size);
+   if (n < nr)
return error("non-monotonic index %s", path);
-   }
nr = n;
}
 
@@ -135,10 +147,8 @@ static int check_packed_git_idx(const char *path, struct 
packed_git *p)
 *  - hash of the packfile
 *  - file checksum
 */
-   if (idx_size != 4*256 + nr * (hashsz + 4) + hashsz + hashsz) {
-   munmap(idx_map, idx_size);
+   if (idx_size != 4 * 256 + nr * (hashsz + 4) + hashsz + hashsz)
return error("wrong index v1 file size in %s", path);
-   }
} else if (version == 2) {
/*
 * Minimum size:
@@ -157,20 +167,16 @@ static int check_packed_git_idx(const char *path, struct 
packed_git *p)
unsigned long max_size = min_size;
if (nr)
max_size += (nr - 1)*8;
-   if (idx_size < min_size || idx_size > max_size) {
-   munmap(idx_map, idx_size);
+   if (idx_size < min_size || idx_size > max_size)
return error("wrong index v2 file size in %s", path);
-   }
if (idx_size != min_size &&

[PATCH v2 0/2] add fuzzing targets for use with OSS-Fuzz

2018-10-12 Thread steadmon
From: Josh Steadmon 

V2 of this series pulls the compiler flags out of the Makefile, to be
provided by the user depending on the combination of compiler and
fuzzing engine in use. This also makes it more compatible with
OSS-Fuzz's build process.

Josh Steadmon (2):
  fuzz: Add basic fuzz testing target.
  fuzz: Add fuzz testing for packfile indices.

 .gitignore  |  3 +++
 Makefile| 33 +
 fuzz-pack-headers.c | 14 ++
 fuzz-pack-idx.c | 13 +
 packfile.c  | 44 +---
 packfile.h  | 13 +
 6 files changed, 101 insertions(+), 19 deletions(-)
 create mode 100644 fuzz-pack-headers.c
 create mode 100644 fuzz-pack-idx.c

Range-diff against v1:
1:  9456c41798 ! 1:  446d8081b1 fuzz: Add basic fuzz testing target.
@@ -32,6 +32,9 @@
  
 +FUZZ_OBJS += fuzz-pack-headers.o
 +
++# Always build fuzz objects even if not testing, to prevent bit-rot.
++all:: $(FUZZ_OBJS)
++
 +FUZZ_PROGRAMS += $(patsubst %.o,%,$(FUZZ_OBJS))
 +
  # Empty...
@@ -46,14 +49,13 @@
git.o
  ifndef NO_CURL
 @@
- cocciclean:
-   $(RM) contrib/coccinelle/*.cocci.patch*
- 
--clean: profile-clean coverage-clean cocciclean
-+clean: profile-clean coverage-clean cocciclean fuzz-clean
-   $(RM) *.res
-   $(RM) $(OBJECTS)
$(RM) $(LIB_FILE) $(XDIFF_LIB) $(VCSSVN_LIB)
+   $(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X
+   $(RM) $(TEST_PROGRAMS) $(NO_INSTALL)
++  $(RM) $(FUZZ_PROGRAMS)
+   $(RM) -r bin-wrappers $(dep_dirs)
+   $(RM) -r po/build/
+   $(RM) *.pyc *.pyo */*.pyc */*.pyo command-list.h $(ETAGS_TARGET) tags 
cscope*
 @@
  cover_db_html: cover_db
cover -report html -outputdir cover_db_html cover_db
@@ -61,24 +63,24 @@
 +
 +### Fuzz testing
 +#
-+.PHONY: fuzz-clean fuzz-objs fuzz-compile
-+
-+FUZZ_CFLAGS = $(CFLAGS) -fsanitize-coverage=trace-pc-guard 
-fsanitize=address
-+FUZZ_LDFLAGS = $(FUZZ_CFLAGS)
-+
-+
-+fuzz-clean:
-+  $(RM) $(FUZZ_PROGRAMS) $(FUZZ_OBJS)
-+
-+fuzz-objs: $(FUZZ_OBJS)
++# Building fuzz targets generally requires a special set of compiler 
flags that
++# are not necessarily appropriate for general builds, and that vary 
greatly
++# depending on the compiler version used.
++#
++# An example command to build against libFuzzer from LLVM 4.0.0:
++#
++# make CC=clang CXX=clang++ \
++#  CFLAGS="-fsanitize-coverage=trace-pc-guard -fsanitize=address" \
++#  LIB_FUZZING_ENGINE=/usr/lib/llvm-4.0/lib/libFuzzer.a \
++#  fuzz-all
++#
++.PHONY: fuzz-all
 +
-+fuzz-compile:
-+  $(MAKE) CC=clang LD=clang CFLAGS="$(FUZZ_CFLAGS)" \
-+  LDFLAGS="$(FUZZ_LDFLAGS)" all fuzz-objs
++$(FUZZ_PROGRAMS): all
++  $(QUIET_LINK)$(CXX) $(CFLAGS) $(LIB_OBJS) $(BUILTIN_OBJS) \
++  $(XDIFF_OBJS) $(EXTLIBS) git.o $@.o $(LIB_FUZZING_ENGINE) -o $@
 +
-+$(FUZZ_PROGRAMS): fuzz-compile
-+  clang++ $(FUZZ_LDFLAGS) $(LIB_OBJS) $(BUILTIN_OBJS) $(XDIFF_OBJS) \
-+  $(EXTLIBS) git.o $@.o /usr/lib/llvm-4.0/lib/libFuzzer.a -o $@
++fuzz-all: $(FUZZ_PROGRAMS)
 
  diff --git a/fuzz-pack-headers.c b/fuzz-pack-headers.c
  new file mode 100644
2:  581eb8f817 ! 2:  c7b5a03d81 fuzz: Add fuzz testing for packfile indices.
@@ -24,23 +24,8 @@
  FUZZ_OBJS += fuzz-pack-headers.o
 +FUZZ_OBJS += fuzz-pack-idx.o
  
- FUZZ_PROGRAMS += $(patsubst %.o,%,$(FUZZ_OBJS))
- 
-@@
- 
- ### Fuzz testing
- #
--.PHONY: fuzz-clean fuzz-objs fuzz-compile
-+.PHONY: fuzz-clean fuzz-objs fuzz-compile fuzz-all
- 
- FUZZ_CFLAGS = $(CFLAGS) -fsanitize-coverage=trace-pc-guard 
-fsanitize=address
- FUZZ_LDFLAGS = $(FUZZ_CFLAGS)
-@@
- $(FUZZ_PROGRAMS): fuzz-compile
-   clang++ $(FUZZ_LDFLAGS) $(LIB_OBJS) $(BUILTIN_OBJS) $(XDIFF_OBJS) \
-   $(EXTLIBS) git.o $@.o /usr/lib/llvm-4.0/lib/libFuzzer.a -o $@
-+
-+fuzz-all: $(FUZZ_PROGRAMS)
+ # Always build fuzz objects even if not testing, to prevent bit-rot.
+ all:: $(FUZZ_OBJS)
 
  diff --git a/fuzz-pack-idx.c b/fuzz-pack-idx.c
  new file mode 100644
-- 
2.19.0.605.g01d371f741-goog



[PATCH v2 1/2] fuzz: Add basic fuzz testing target.

2018-10-12 Thread steadmon
From: Josh Steadmon 

fuzz-pack-headers.c provides a fuzzing entry point compatible with
libFuzzer (and possibly other fuzzing engines).

Signed-off-by: Josh Steadmon 
---
 .gitignore  |  2 ++
 Makefile| 32 
 fuzz-pack-headers.c | 14 ++
 3 files changed, 48 insertions(+)
 create mode 100644 fuzz-pack-headers.c

diff --git a/.gitignore b/.gitignore
index 9d1363a1eb..87a28b3115 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,3 +1,5 @@
+/fuzz_corpora
+/fuzz-pack-headers
 /GIT-BUILD-OPTIONS
 /GIT-CFLAGS
 /GIT-LDFLAGS
diff --git a/Makefile b/Makefile
index 13e1c52478..7f5a19b8ab 100644
--- a/Makefile
+++ b/Makefile
@@ -590,6 +590,8 @@ XDIFF_OBJS =
 VCSSVN_OBJS =
 GENERATED_H =
 EXTRA_CPPFLAGS =
+FUZZ_OBJS =
+FUZZ_PROGRAMS =
 LIB_OBJS =
 PROGRAM_OBJS =
 PROGRAMS =
@@ -682,6 +684,13 @@ SCRIPTS = $(SCRIPT_SH_INS) \
 
 ETAGS_TARGET = TAGS
 
+FUZZ_OBJS += fuzz-pack-headers.o
+
+# Always build fuzz objects even if not testing, to prevent bit-rot.
+all:: $(FUZZ_OBJS)
+
+FUZZ_PROGRAMS += $(patsubst %.o,%,$(FUZZ_OBJS))
+
 # Empty...
 EXTRA_PROGRAMS =
 
@@ -2250,6 +2259,7 @@ TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) 
$(patsubst %,t/helper/%,$(TEST
 OBJECTS := $(LIB_OBJS) $(BUILTIN_OBJS) $(PROGRAM_OBJS) $(TEST_OBJS) \
$(XDIFF_OBJS) \
$(VCSSVN_OBJS) \
+   $(FUZZ_OBJS) \
common-main.o \
git.o
 ifndef NO_CURL
@@ -2937,6 +2947,7 @@ clean: profile-clean coverage-clean cocciclean
$(RM) $(LIB_FILE) $(XDIFF_LIB) $(VCSSVN_LIB)
$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X
$(RM) $(TEST_PROGRAMS) $(NO_INSTALL)
+   $(RM) $(FUZZ_PROGRAMS)
$(RM) -r bin-wrappers $(dep_dirs)
$(RM) -r po/build/
$(RM) *.pyc *.pyo */*.pyc */*.pyo command-list.h $(ETAGS_TARGET) tags 
cscope*
@@ -3061,3 +3072,24 @@ cover_db: coverage-report
 cover_db_html: cover_db
cover -report html -outputdir cover_db_html cover_db
 
+
+### Fuzz testing
+#
+# Building fuzz targets generally requires a special set of compiler flags that
+# are not necessarily appropriate for general builds, and that vary greatly
+# depending on the compiler version used.
+#
+# An example command to build against libFuzzer from LLVM 4.0.0:
+#
+# make CC=clang CXX=clang++ \
+#  CFLAGS="-fsanitize-coverage=trace-pc-guard -fsanitize=address" \
+#  LIB_FUZZING_ENGINE=/usr/lib/llvm-4.0/lib/libFuzzer.a \
+#  fuzz-all
+#
+.PHONY: fuzz-all
+
+$(FUZZ_PROGRAMS): all
+   $(QUIET_LINK)$(CXX) $(CFLAGS) $(LIB_OBJS) $(BUILTIN_OBJS) \
+   $(XDIFF_OBJS) $(EXTLIBS) git.o $@.o $(LIB_FUZZING_ENGINE) -o $@
+
+fuzz-all: $(FUZZ_PROGRAMS)
diff --git a/fuzz-pack-headers.c b/fuzz-pack-headers.c
new file mode 100644
index 00..99da1d0fd3
--- /dev/null
+++ b/fuzz-pack-headers.c
@@ -0,0 +1,14 @@
+#include "packfile.h"
+
+int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size);
+
+int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
+{
+   enum object_type type;
+   unsigned long len;
+
+   unpack_object_header_buffer((const unsigned char *)data,
+   (unsigned long)size, , );
+
+   return 0;
+}
-- 
2.19.0.605.g01d371f741-goog



Re: Git wire protocol v2 fails fetching shallow changes with `pack has XXX unresolved deltas` on large repos with lots of tags Inbox x

2018-10-12 Thread Jonathan Tan
> On large repositories with lots of tags - git wire protocol v2 fails
> to fetch shallow changes, it fails with error `pack has XXX unresolved
> deltas`. Unfortunately I didn't find easy way to reproduce it except
> cloning+fetching chromium repository, the way jenkins does.
> Reproduction steps:

[snip]

Thanks for your bug report and reproduction steps. I managed to
reproduce your issue and took a look.

The main issue seems to be that with v2, upload-pack doesn't pass
"--shallow X" to pack-objects (the write_one_shallow() callback is never
called, even if I change the "if (shallow_nr)" to "if (1)"), so
pack-objects probably doesn't know that some objects cannot be used as
delta bases. (With v0, write_one_shallow() is indeed called.) The issue
probably lies in how v0 and v2 handle client-provided shallows
differently.

There also seems to be another issue in that negotiation occurs
differently in these 2 protocols - I see the full list of "have" lines
being sent in the final request to the server in v0, but a very limited
list in v2. This might be because of the ref prefix limiting in v2, but
I haven't fully investigated it.

I'll look some more into this.


Re: [RFC PATCH 00/19] Bring more repository handles into our code base

2018-10-12 Thread Stefan Beller
On Fri, Oct 12, 2018 at 11:50 AM Jonathan Nieder  wrote:

>
> It appears that some patches use a the_index-style
> NO_THE_REPOSITORY_COMPATIBILITY_MACROS backward compatibility synonym
> and others don't.  Can you say a little more about this aspect of the
> approach?  Would the compatibility macros go away eventually?

I use the macro only when not doing the whole conversion in the patch
(i.e. there is a coccinelle patch IFF there is the macro and vice versa).

It's quite frankly a judgement call what I would convert as a whole
and what not, it depends on the usage of the functions and if I know
series that are in flight using it. The full conversion is easy to write if
there are less than a hand full of callers, so for the "small case", I just
did it, hoping it won't break other topics in flight.


Re: [PATCH 17/19] submodule: use submodule repos for object lookup

2018-10-12 Thread Stefan Beller
On Thu, Oct 11, 2018 at 3:41 PM Jonathan Tan  wrote:
>
> > +/*
> > + * Initialize 'out' based on the provided submodule path.
> > + *
> > + * Unlike repo_submodule_init, this tolerates submodules not present
> > + * in .gitmodules. NEEDSWORK: The repo_submodule_init behavior is
> > + * preferrable. This function exists only to preserve historical behavior.
>
> What do you mean by "The repo_submodule_init behavior is preferable"?

Preferable in the sense that it follows the definition of a submodule, but this
here works for "any repo" that happens to be at the gitlink.

>  If
> we need to preserve historical behavior, then it seems that the most
> preferable one is something that meets our needs (like open_submodule()
> in this patch).

Yes, I'll reword to drop the preferrable, but still state the difference.

I wonder if instead we'd want to introduce a

repo_submodule_init(struct repository *submodule \
struct repository *superproject \
const char *path, \
int tolerate_lookalikes)

Another patch proposes to replace the path
by a struct submodule, but for lookalikes, we do not have
a struct submodule to begin with (though in the other
patches we cook up a fake entry in the submodule config)

> > +static int open_submodule(struct repository *out, const char *path)
> > +{
> > + struct strbuf sb = STRBUF_INIT;
> > +
> > + if (submodule_to_gitdir(, path) || repo_init(out, sb.buf, NULL)) {
> > + strbuf_release();
> > + return -1;
> > + }
> > +
> > + out->submodule_prefix = xstrdup(path);
>
> Do we need to set submodule_prefix?

Good point! Thanks for catching this.

>
> > @@ -507,7 +533,7 @@ static void show_submodule_header(struct diff_options 
> > *o, const char *path,
> >   else if (is_null_oid(two))
> >   message = "(submodule deleted)";
> >
> > - if (add_submodule_odb(path)) {
> > + if (open_submodule(sub, path) < 0) {
>
> This function, as a side effect, writes the open repository to "sub" for
> its caller to use. I think it's better if its callers open "sub" and
> then pass it to show_submodule_header() if successful. If opening the
> submodule is expected to fail sometimes (like it seems here), then we
> can allow callers to also pass NULL, and document what happens when NULL
> is passed.

That looks intriguing, I'll take a look. Note that we also pass
in **left and **right to have it assigned in there.

>
> Also, repositories open in this way should also be freed.

Yes, thanks!


Re: [PATCH 05/19] object: parse_object to honor its repository argument

2018-10-12 Thread Stefan Beller
On Thu, Oct 11, 2018 at 3:11 PM Jonathan Tan  wrote:
>
> > In 8e4b0b6047 (object.c: allow parse_object to handle
> > arbitrary repositories, 2018-06-28), we forgot to pass the
> > repository down to the read_object_file.
>
> [snip]
>
> > @@ -270,7 +270,7 @@ struct object *parse_object(struct repository *r, const 
> > struct object_id *oid)
> >   return lookup_object(r, oid->hash);
> >   }
> >
> > - buffer = read_object_file(oid, , );
> > + buffer = repo_read_object_file(r, oid, , );
>
> There is still the matter of the 2 invocations of has_object_file()
> earlier in this function. The first one probably can be replaced with
> oid_object_info_extended() (see the definition of
> has_sha1_file_with_flags() to see how to do it), and the second one
> looks redundant to me and can be removed. Apart from that, I don't see
> any other invocations that need to be converted, so parse_object() will
> indeed fully honor its repository argument.

I will convert the has_{sha1, object}_file[_with_flags] functions
before this patch in a resend and just pass along the repository.

I'll defer the change of logic to another patch to be followed up later.


Re: [PATCH 1/1] protocol: limit max protocol version per service

2018-10-12 Thread Jonathan Nieder
Josh Steadmon wrote:
> On 2018.10.12 16:32, Jonathan Nieder wrote:
>> Josh Steadmon wrote:

>>> For now, I'm going to try adding an --allowed_versions flag for the
>>> remote helpers, but if anyone has a better idea, let me know.
>>
>> I forgot to mention: the stateless-connect remote helper capability is
>> still experimental so you're free to change its interface (e.g. to
>> change the syntax of the stateless-connect command it provides).
>
> For v2 of this patch, I ended up using GIT_PROTOCOL to pass the version
> string to the remote helpers.

Makes sense.  Thanks. :)


Re: [PATCH 1/1] protocol: limit max protocol version per service

2018-10-12 Thread Josh Steadmon
On 2018.10.12 16:32, Jonathan Nieder wrote:
> Josh Steadmon wrote:
> 
> > For now, I'm going to try adding an --allowed_versions flag for the
> > remote helpers, but if anyone has a better idea, let me know.
> 
> I forgot to mention: the stateless-connect remote helper capability is
> still experimental so you're free to change its interface (e.g. to
> change the syntax of the stateless-connect command it provides).

For v2 of this patch, I ended up using GIT_PROTOCOL to pass the version
string to the remote helpers.


Re: What's cooking in git.git (Oct 2018, #02; Sat, 13)

2018-10-12 Thread Stefan Beller
On Fri, Oct 12, 2018 at 12:44 PM Stefan Beller  wrote:

> > * sb/submodule-recursive-fetch-gets-the-tip (2018-10-11) 9 commits
> >  . builtin/fetch: check for submodule updates for non branch fetches
> >  . fetch: retry fetching submodules if needed objects were not fetched
> >  . submodule: fetch in submodules git directory instead of in worktree
> >  . repository: repo_submodule_init to take a submodule struct
> >  . submodule.c: do not copy around submodule list
> >  . submodule.c: move global changed_submodule_names into fetch submodule 
> > struct
> >  . submodule.c: sort changed_submodule_names before searching it
> >  . submodule.c: fix indentation
> >  . sha1-array: provide oid_array_filter
> >
> >  "git fetch --recurse-submodules" may not fetch the necessary commit
> >  that is bound to the superproject, which is getting corrected.
> >
> >  Ejected for now, as it has fallouts in places like t/helper/.
>
> This is the first time I hear about that, I'll look into that.

I looked into that, and merging with origin/next only
had one merge conflict in submodule.c, easy to resolve.
It builds and tests cleanly after that.

What fallouts did you observe?

> The tipmost commit there is also shoddy, I'll redo that.

I confused this series with sb/submodule-update-in-C
which got merged already, I may send a fixup commit there.

So it seems to me that this series is still good.

Thanks,
Stefan


Re: [PATCH 1/1] protocol: limit max protocol version per service

2018-10-12 Thread Jonathan Nieder
Josh Steadmon wrote:

> For now, I'm going to try adding an --allowed_versions flag for the
> remote helpers, but if anyone has a better idea, let me know.

I forgot to mention: the stateless-connect remote helper capability is
still experimental so you're free to change its interface (e.g. to
change the syntax of the stateless-connect command it provides).

Thanks again,
Jonathan


Re: [PATCH 1/1] protocol: limit max protocol version per service

2018-10-12 Thread Jonathan Nieder
Hi,

Josh Steadmon wrote:

> So this runs into problems with remote-curl (and possibly other remote
> helpers):
>
> builtin/push.c can declare whatever allowed versions it wants, but those
> are not carried over when remote-curl is started to actually talk to the
> remote. What's worse, remote-curl starts its HTTP connection before it
> knows what command it's actually acting as a helper for.
>
> For now, I'm going to try adding an --allowed_versions flag for the
> remote helpers, but if anyone has a better idea, let me know.

There are some external remote helpers (see "git help remote-helpers"
for the documented interface), so alas, they can't take new flags.

That said, you can add a new remote helper capability and then
communicate on stdin, or in a pinch you can use the existing "option"
capability.  Remote helpers are also allowed to query "git config" if
they want to (either using the config machinery in config.h or by
running "git config").

Thanks,
Jonathan


Re: [PATCH] zlib.c: use size_t for size

2018-10-12 Thread Ramsay Jones



On 12/10/18 16:34, Johannes Schindelin wrote:
> Hi Junio,
> 
> On Fri, 12 Oct 2018, Junio C Hamano wrote:
> 
>> Johannes Schindelin  writes:
>>
>>> Hi Junio,
>>>
>>> On Fri, 12 Oct 2018, Junio C Hamano wrote:
>>>
 From: Martin Koegler 
 Date: Thu, 10 Aug 2017 20:13:08 +0200

 Signed-off-by: Martin Koegler 
 Signed-off-by: Junio C Hamano 
 ---

  * I made minimal adjustments to make the change apply to today's
codebase.  I still find some choices and mixing of off_t and
size_t done by the patch a bit iffy, and some arith may need to
become st_addX().  Extra set of eyes are very much appreciated.

  builtin/pack-objects.c | 10 +-
  cache.h| 10 +-
  pack-check.c   |  6 +++---
  pack.h |  2 +-
  packfile.h |  2 +-
  wrapper.c  |  8 
  zlib.c |  8 
  7 files changed, 23 insertions(+), 23 deletions(-)

 diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
 index e6316d294d..b9ca04eb8a 100644
 --- a/builtin/pack-objects.c
 +++ b/builtin/pack-objects.c
 @@ -266,15 +266,15 @@ static void copy_pack_data(struct hashfile *f,
struct packed_git *p,
struct pack_window **w_curs,
off_t offset,
 -  off_t len)
 +  size_t len)
  {
unsigned char *in;
 -  unsigned long avail;
 +  size_t avail;
  
while (len) {
in = use_pack(p, w_curs, offset, );
if (avail > len)
>>>
>>> Do we have to be careful to handle sizeof(off_t) != sizeof(size_t) here? I
>>> guess we would receive a compiler warning about different sizes in that
>>> case, but it makes me worry.
>>
>> Yup, you just said exactly the same thing as I already said in the
>> part you quoted.  I still find the mixed use of off_t and size_t in
>> this patch iffy, and that was the secondary reason why the patch was
>> kept in the stalled state for so long.  The primary reason was that
>> nobody tried to dust it off and reignite the topic so far---which I
>> am trying to correct, but as I said, this is just minimally adjusted
>> to today's codebase, without any attempt to improve relative to the
>> original patch.
>>
>> I think we are in agreement in that this is not making things worse,
>> as we are already in the mixed arith territory before this patch.
> 
> I will *try* to find the time to audit this some more, then. Maybe next
> week, maybe the one afterwards... ;-)

This fails to compile on 32-bit Linux. The patch given below is
what I used to get git to build on 32-bit Linux (and it even
passes the tests).

I haven't even given the off_t -> size_t issue any thought, but I
suspect that will have to change as well. Anyway, I have no more
time tonight ...

ATB,
Ramsay Jones

-- >8 --
Subject: [PATCH] zlib: minimum fix-up on 32-bit linux

Signed-off-by: Ramsay Jones 
---
 builtin/pack-objects.c |  3 ++-
 packfile.c | 16 
 packfile.h |  2 +-
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 7e693071b..cc11a0426 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2002,7 +2002,8 @@ unsigned long oe_get_size_slow(struct packing_data *pack,
struct pack_window *w_curs;
unsigned char *buf;
enum object_type type;
-   unsigned long used, avail, size;
+   size_t used, avail;
+   unsigned long size;
 
if (e->type_ != OBJ_OFS_DELTA && e->type_ != OBJ_REF_DELTA) {
read_lock();
diff --git a/packfile.c b/packfile.c
index f2850a00b..7571ac988 100644
--- a/packfile.c
+++ b/packfile.c
@@ -585,7 +585,7 @@ static int in_window(struct pack_window *win, off_t offset)
 unsigned char *use_pack(struct packed_git *p,
struct pack_window **w_cursor,
off_t offset,
-   unsigned long *left)
+   size_t *left)
 {
struct pack_window *win = *w_cursor;
 
@@ -1034,19 +1034,19 @@ struct list_head *get_packed_git_mru(struct repository 
*r)
return >objects->packed_git_mru;
 }
 
-unsigned long unpack_object_header_buffer(const unsigned char *buf,
-   unsigned long len, enum object_type *type, unsigned long *sizep)
+size_t unpack_object_header_buffer(const unsigned char *buf,
+   size_t len, enum object_type *type, unsigned long *sizep)
 {
unsigned shift;
-   unsigned long size, c;
-   unsigned long used = 0;
+   size_t size, c;
+   size_t used = 0;
 
c = buf[used++];
*type = (c >> 4) & 7;
size = c & 15;
shift = 4;
while (c & 0x80) {
-   if (len <= used || bitsizeof(long) <= shift) {
+   if (len <= used || bitsizeof(size_t) <= shift) {
error("bad object header");
  

Re: [PATCH] range-diff: allow to diff files regardless submodule

2018-10-12 Thread brian m. carlson
On Fri, Oct 12, 2018 at 11:24:43AM +0200, Johannes Schindelin wrote:
> 
> 
> On Thu, 11 Oct 2018, Lucas De Marchi wrote:
> 
> > On Wed, Oct 10, 2018 at 5:02 PM brian m. carlson
> >  wrote:
> > >
> > > On Wed, Oct 10, 2018 at 08:09:16AM -0700, Lucas De Marchi wrote:
> > > > Do like it's done in grep so mode doesn't end up as
> > > > 016, which means range-diff doesn't work if one has
> > > > "submodule.diff = log" in the configuration. Without this
> > > > while using range-diff I only get a
> > > >
> > > > Submodule a 000...000 (new submodule)
> > > >
> > > > instead of the diff between the revisions.
> > > >
> > > > Signed-off-by: Lucas De Marchi 
> > > > ---
> > > >  range-diff.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/range-diff.c b/range-diff.c
> > > > index 60edb2f518..bd8083f2d1 100644
> > > > --- a/range-diff.c
> > > > +++ b/range-diff.c
> > > > @@ -354,7 +354,7 @@ static struct diff_filespec *get_filespec(const 
> > > > char *name, const char *p)
> > > >  {
> > > >   struct diff_filespec *spec = alloc_filespec(name);
> > > >
> > > > - fill_filespec(spec, _oid, 0, 0644);
> > > > + fill_filespec(spec, _oid, 0, 0100644);
> > >
> > > If we have a system that has different mode values from the common Unix
> > > ones, is this still correct or does it need to change?
> > 
> > From what I can see this would still be correct, or at least git-grep
> > implementation would be broken.
> 
> As you can see from the Windows port: we are stuck with the simplistic
> POSIX permissions in Git, and platforms that have a different permission
> system have to emulate it.

I think I may not have explained myself well.  There are a small number
of POSIXy systems which have mode bits that differ from the common ones
(e.g., a plain file is something other than 010).  I think one
person mentioned on the list that they have a homebrew Unix that works
this way, and I think I may have heard of some minor commercial Unices
that work this way as well.

My question was intended to ask whether we should be using an
OS-provided constant (e.g., S_IFREG) that represented that value
differently because it was a system value or whether it was the internal
Git representation.

I hadn't intended to inquire about Windows, as I was fairly confident
that this syntax does indeed work there through our compatibility layers
(because it has in the past even when we've had these kinds of issues on
other Unices).  But I'm glad that you chimed in and confirmed that it
does.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v2 1/1] protocol: advertise multiple supported versions

2018-10-12 Thread Stefan Beller
On Thu, Oct 11, 2018 at 6:02 PM  wrote:
>
> From: Josh Steadmon 
>
> Currently the client advertises that it supports the wire protocol
> version set in the protocol.version config. However, not all services
> support the same set of protocol versions. When connecting to
> git-receive-pack, the client automatically downgrades to v0 if
> config.protocol is set to v2, but this check is not performed for other
> services.
>
> This patch creates a protocol version registry. Individual commands
> register all the protocol versions they support prior to communicating
> with a server. Versions should be listed in preference order; the
> version specified in protocol.version will automatically be moved to the
> front of the registry.
>
> The protocol version registry is passed to remote helpers via the
> GIT_PROTOCOL environment variable.
>
> Clients now advertise the full list of registered versions. Servers
> select the first recognized version from this advertisement.

I like this patch from the users point of view (i.e. inside fetch/push etc),
and I need to through the details in connect/protocol as that seems
to be a lot of new code, but hides the complexity very well.

> +   register_allowed_protocol_version(protocol_v2);
> +   register_allowed_protocol_version(protocol_v1);
> +   register_allowed_protocol_version(protocol_v0);

Would it make sense to have a

register_allowed_protocol_versions(protocol_v2, protocol_v1,
protocol_v0, NULL);

similar to argv_array_pushl  or would that be overengineered?

> @@ -1085,10 +1085,9 @@ static struct child_process *git_connect_git(int 
> fd[2], char *hostandport,
> target_host, 0);
>
> /* If using a new version put that stuff here after a second null 
> byte */
> -   if (version > 0) {
> +   if (strcmp(version_advert->buf, "version=0")) {
> strbuf_addch(, '\0');
> -   strbuf_addf(, "version=%d%c",
> -   version, '\0');
> +   strbuf_addf(, "%s%c", version_advert->buf, '\0');

It's a bit unfortunate to pass around the string, but reading through
the following
lines seems like it is easiest.


> @@ -1226,16 +1226,10 @@ struct child_process *git_connect(int fd[2], const 
> char *url,
>  {
> char *hostandport, *path;
> struct child_process *conn;
> +   struct strbuf version_advert = STRBUF_INIT;
> enum protocol protocol;
> -   enum protocol_version version = get_protocol_version_config();
>
> -   /*
> -* NEEDSWORK: If we are trying to use protocol v2 and we are planning
> -* to perform a push, then fallback to v0 since the client doesn't 
> know
> -* how to push yet using v2.
> -*/
> -   if (version == protocol_v2 && !strcmp("git-receive-pack", prog))
> -   version = protocol_v0;
> +   get_client_protocol_version_advertisement(_advert);

Nice to have this special casing gone!

> diff --git a/protocol.c b/protocol.c
> index 5e636785d1..f788269c47 100644
> +
> +static const char protocol_v0_string[] = "0";
> +static const char protocol_v1_string[] = "1";
> +static const char protocol_v2_string[] = "2";
> +
...
> +/* Return the text representation of a wire protocol version. */
> +static const char *format_protocol_version(enum protocol_version version)
> +{
> +   switch (version) {
> +   case protocol_v0:
> +   return protocol_v0_string;
> +   case protocol_v1:
> +   return protocol_v1_string;
> +   case protocol_v2:
> +   return protocol_v2_string;
> +   case protocol_unknown_version:
> +   die(_("Unrecognized protocol version"));
> +   }
> +   die(_("Unrecognized protocol_version"));
> +}

Up to now we have the textual representation that can easily
be constructed from protocol_version by using e.g.
sprintf(buf, "%d", version);
which is why I initially thought this could be worded way
shorter, but I guess this is more future proof?


> +
>  enum protocol_version get_protocol_version_config(void)
>  {
> const char *value;
> @@ -30,6 +55,79 @@ enum protocol_version get_protocol_version_config(void)
> return protocol_v0;
>  }
>
> +void register_allowed_protocol_version(enum protocol_version version)
> +{
> +   if (have_advertised_versions_already)
> +   error(_("attempting to register an allowed protocol version 
> after advertisement"));

This would be a
BUG(...)
instead?


Re: [PATCH] submodule helper: convert relative URL to absolute URL if needed

2018-10-12 Thread Jonathan Nieder
Hi,

Stefan Beller wrote:

> The submodule helper update_clone called by "git submodule update",
> clones submodules if needed. As submodules used to have the URL indicating
> if they were active, the step to resolve relative URLs was done in the
> "submodule init" step. Nowadays submodules can be configured active without
> calling an explicit init, e.g. via configuring submodule.active.
>
> Then we'll fallback to the URL found in the .gitmodules, which may be
> relative to the superproject, but we do not resolve it, yet.

Oh!  Good catch.

> To do so, factor out the function that resolves the relative URLs in
> "git submodule init" (in the submodule helper in the init_submodule
> function) and cal it at the appropriate place in the update_clone helper.

s/cal/call/

> Signed-off-by: Stefan Beller 
> ---
>  builtin/submodule--helper.c | 48 -
>  t/t7400-submodule-basic.sh  | 24 +++
>  2 files changed, 55 insertions(+), 17 deletions(-)

What is the symptom when this fails?

> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index f6fb8991f3..a9a3ac38be 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -584,6 +584,27 @@ static int module_foreach(int argc, const char **argv, 
> const char *prefix)
>   return 0;
>  }
>  
> +
> +char *compute_submodule_clone_url(const char *rel_url)

Should be static.

Is the caller responsible for freeing the returned buffer?

> +{
> + char *remoteurl, *relurl;
> + char *remote = get_default_remote();
> + struct strbuf remotesb = STRBUF_INIT;

optional: could rename, something like

struct strbuf key = STRBUF_INIT;

remote = get_default_remote();
strbuf_addf(, "remote.%s.url", remote);

 [...]
strbuf_release();
free(remote);
return result;


> +
> + strbuf_addf(, "remote.%s.url", remote);
> + free(remote);
> +
> + if (git_config_get_string(remotesb.buf, )) {
> + warning(_("could not lookup configuration '%s'. Assuming this 
> repository is its own authoritative upstream."), remotesb.buf);

nit: lookup is the noun, "look up" is the verb

> + remoteurl = xgetcwd();
> + }
> + relurl = relative_url(remoteurl, rel_url, NULL);
> + strbuf_release();
> + free(remoteurl);
> +
> + return relurl;
> +}
> +
>  struct init_cb {
>   const char *prefix;
>   unsigned int flags;
> @@ -634,21 +655,9 @@ static void init_submodule(const char *path, const char 
> *prefix,
>   /* Possibly a url relative to parent */
>   if (starts_with_dot_dot_slash(url) ||
>   starts_with_dot_slash(url)) {
> - char *remoteurl, *relurl;
> - char *remote = get_default_remote();
> - struct strbuf remotesb = STRBUF_INIT;
> - strbuf_addf(, "remote.%s.url", remote);
> - free(remote);
> -
> - if (git_config_get_string(remotesb.buf, )) {
> - warning(_("could not lookup configuration '%s'. 
> Assuming this repository is its own authoritative upstream."), remotesb.buf);
> - remoteurl = xgetcwd();
> - }
> - relurl = relative_url(remoteurl, url, NULL);
> - strbuf_release();
> - free(remoteurl);
> - free(url);
> - url = relurl;

Ah, this is moved code. I should have used --color-moved. ;-)

> + char *to_free = url;
> + url = compute_submodule_clone_url(url);
> + free(to_free);

Maybe:

char *old_url = url;
url = ...(old_url);
free(old_url);

>   }
>  
>   if (git_config_set_gently(sb.buf, url))
> @@ -1562,8 +1571,13 @@ static int prepare_to_clone_next_submodule(const 
> struct cache_entry *ce,
>  
>   strbuf_reset();
>   strbuf_addf(, "submodule.%s.url", sub->name);
> - if (repo_config_get_string_const(the_repository, sb.buf, ))
> - url = sub->url;
> + if (repo_config_get_string_const(the_repository, sb.buf, )) {
> + if (starts_with_dot_slash(sub->url) ||
> + starts_with_dot_dot_slash(sub->url))
> + url = compute_submodule_clone_url(sub->url);
> + else
> + url = sub->url;
> + }

Nice.

>  
>   strbuf_reset();
>   strbuf_addf(, "%s/.git", ce->name);
> diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
> index c0ffc1022a..3f5dd5e4ef 100755
> --- a/t/t7400-submodule-basic.sh
> +++ b/t/t7400-submodule-basic.sh
> @@ -1224,6 +1224,30 @@ test_expect_success 'submodule update and setting 
> submodule..active' '
>   test_cmp expect actual
>  '
>  
> +test_expect_success 'clone 

Re: [PATCH v2 1/1] zlib.c: use size_t for size

2018-10-12 Thread SZEDER Gábor
On Fri, Oct 12, 2018 at 10:42:29PM +0200, tbo...@web.de wrote:
> From: Martin Koegler 
> 
> Signed-off-by: Martin Koegler 
> Signed-off-by: Junio C Hamano 
> Signed-off-by: Torsten Bögershausen 
> ---
> 
> After doing a review, I decided to send the result as a patch.
> In general, the changes from off_t to size_t seem to be not really
> motivated.
> But if they are, they could and should go into an own patch.
> For the moment, change only "unsigned long" into size_t, thats all

Neither v1 nor v2 of this patch compiles on 32 bit Linux; see

  https://travis-ci.org/git/git/jobs/440514375#L628

The fixup patch below makes it compile on 32 bit and the test suite
passes as well, but I didn't thoroughly review the changes; I only
wanted to get 'pu' build again.


  --  >8 --

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 23c4cd8c77..89fe1c5d46 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1966,7 +1966,8 @@ unsigned long oe_get_size_slow(struct packing_data *pack,
struct pack_window *w_curs;
unsigned char *buf;
enum object_type type;
-   unsigned long used, avail, size;
+   unsigned long used, size;
+   size_t avail;
 
if (e->type_ != OBJ_OFS_DELTA && e->type_ != OBJ_REF_DELTA) {
read_lock();
diff --git a/packfile.c b/packfile.c
index 841b36182f..9f50411ad3 100644
--- a/packfile.c
+++ b/packfile.c
@@ -579,7 +579,7 @@ static int in_window(struct pack_window *win, off_t offset)
 unsigned char *use_pack(struct packed_git *p,
struct pack_window **w_cursor,
off_t offset,
-   unsigned long *left)
+   size_t *left)
 {
struct pack_window *win = *w_cursor;
 
@@ -1098,7 +1098,7 @@ int unpack_object_header(struct packed_git *p,
 unsigned long *sizep)
 {
unsigned char *base;
-   unsigned long left;
+   size_t left;
unsigned long used;
enum object_type type;
 


[PATCH] submodule helper: convert relative URL to absolute URL if needed

2018-10-12 Thread Stefan Beller
The submodule helper update_clone called by "git submodule update",
clones submodules if needed. As submodules used to have the URL indicating
if they were active, the step to resolve relative URLs was done in the
"submodule init" step. Nowadays submodules can be configured active without
calling an explicit init, e.g. via configuring submodule.active.

Then we'll fallback to the URL found in the .gitmodules, which may be
relative to the superproject, but we do not resolve it, yet.

To do so, factor out the function that resolves the relative URLs in
"git submodule init" (in the submodule helper in the init_submodule
function) and cal it at the appropriate place in the update_clone helper.

Signed-off-by: Stefan Beller 
---
 builtin/submodule--helper.c | 48 -
 t/t7400-submodule-basic.sh  | 24 +++
 2 files changed, 55 insertions(+), 17 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index f6fb8991f3..a9a3ac38be 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -584,6 +584,27 @@ static int module_foreach(int argc, const char **argv, 
const char *prefix)
return 0;
 }
 
+
+char *compute_submodule_clone_url(const char *rel_url)
+{
+   char *remoteurl, *relurl;
+   char *remote = get_default_remote();
+   struct strbuf remotesb = STRBUF_INIT;
+
+   strbuf_addf(, "remote.%s.url", remote);
+   free(remote);
+
+   if (git_config_get_string(remotesb.buf, )) {
+   warning(_("could not lookup configuration '%s'. Assuming this 
repository is its own authoritative upstream."), remotesb.buf);
+   remoteurl = xgetcwd();
+   }
+   relurl = relative_url(remoteurl, rel_url, NULL);
+   strbuf_release();
+   free(remoteurl);
+
+   return relurl;
+}
+
 struct init_cb {
const char *prefix;
unsigned int flags;
@@ -634,21 +655,9 @@ static void init_submodule(const char *path, const char 
*prefix,
/* Possibly a url relative to parent */
if (starts_with_dot_dot_slash(url) ||
starts_with_dot_slash(url)) {
-   char *remoteurl, *relurl;
-   char *remote = get_default_remote();
-   struct strbuf remotesb = STRBUF_INIT;
-   strbuf_addf(, "remote.%s.url", remote);
-   free(remote);
-
-   if (git_config_get_string(remotesb.buf, )) {
-   warning(_("could not lookup configuration '%s'. 
Assuming this repository is its own authoritative upstream."), remotesb.buf);
-   remoteurl = xgetcwd();
-   }
-   relurl = relative_url(remoteurl, url, NULL);
-   strbuf_release();
-   free(remoteurl);
-   free(url);
-   url = relurl;
+   char *to_free = url;
+   url = compute_submodule_clone_url(url);
+   free(to_free);
}
 
if (git_config_set_gently(sb.buf, url))
@@ -1562,8 +1571,13 @@ static int prepare_to_clone_next_submodule(const struct 
cache_entry *ce,
 
strbuf_reset();
strbuf_addf(, "submodule.%s.url", sub->name);
-   if (repo_config_get_string_const(the_repository, sb.buf, ))
-   url = sub->url;
+   if (repo_config_get_string_const(the_repository, sb.buf, )) {
+   if (starts_with_dot_slash(sub->url) ||
+   starts_with_dot_dot_slash(sub->url))
+   url = compute_submodule_clone_url(sub->url);
+   else
+   url = sub->url;
+   }
 
strbuf_reset();
strbuf_addf(, "%s/.git", ce->name);
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index c0ffc1022a..3f5dd5e4ef 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -1224,6 +1224,30 @@ test_expect_success 'submodule update and setting 
submodule..active' '
test_cmp expect actual
 '
 
+test_expect_success 'clone active submodule without submodule url set' '
+   test_when_finished "rm -rf test/test" &&
+   mkdir test &&
+   # another dir breaks accidental relative paths still being correct
+   git clone file://"$pwd"/multisuper test/test &&
+   (
+   cd test/test &&
+   git config submodule.active "." &&
+
+   # do not pass --init flag, as it is already active:
+   git submodule update &&
+   git submodule status >actual_raw &&
+
+   cut -c 1,43- actual_raw >actual &&
+   cat >expect <<-\EOF &&
+sub0 (test2)
+sub1 (test2)
+sub2 (test2)
+sub3 (test2)
+   EOF
+   test_cmp expect actual
+   )
+'
+
 

[PATCH 2/4] merge-recursive: increase marker length with depth of recursion

2018-10-12 Thread Elijah Newren
When using merge.conflictstyle=diff3 to have the "base version" be shown
in conflicts, there is the possibility that the base version itself has
conflicts in it.  This comes about when there are more than one merge
base, and the merging of those merge bases produces a conflict.
Since this process applies recursively, it is possible to have conflict
markers nested at an arbitrary depth; to be able to differentiate the
conflict markers from different nestings, we make them all of different
lengths.

Signed-off-by: Elijah Newren 
---
 ll-merge.c|  4 +++-
 ll-merge.h|  1 +
 merge-recursive.c | 22 +++---
 t/t6036-recursive-corner-cases.sh |  2 +-
 4 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/ll-merge.c b/ll-merge.c
index 0e2800f7bb..aabc1b5c2e 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -384,7 +384,9 @@ int ll_merge(mmbuffer_t *result_buf,
if (opts->virtual_ancestor) {
if (driver->recursive)
driver = find_ll_merge_driver(driver->recursive);
-   marker_size += 2;
+   }
+   if (opts->extra_marker_size) {
+   marker_size += opts->extra_marker_size;
}
return driver->fn(driver, result_buf, path, ancestor, ancestor_label,
  ours, our_label, theirs, their_label,
diff --git a/ll-merge.h b/ll-merge.h
index b72b19921e..5b4e158502 100644
--- a/ll-merge.h
+++ b/ll-merge.h
@@ -11,6 +11,7 @@ struct ll_merge_options {
unsigned virtual_ancestor : 1;
unsigned variant : 2;   /* favor ours, favor theirs, or union merge */
unsigned renormalize : 1;
+   unsigned extra_marker_size;
long xdl_opts;
 };
 
diff --git a/merge-recursive.c b/merge-recursive.c
index 5206d6cfb6..2452788d28 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1039,7 +1039,8 @@ static int merge_3way(struct merge_options *o,
  const struct diff_filespec *a,
  const struct diff_filespec *b,
  const char *branch1,
- const char *branch2)
+ const char *branch2,
+ const int extra_marker_size)
 {
mmfile_t orig, src1, src2;
struct ll_merge_options ll_opts = {0};
@@ -1047,6 +1048,7 @@ static int merge_3way(struct merge_options *o,
int merge_status;
 
ll_opts.renormalize = o->renormalize;
+   ll_opts.extra_marker_size = extra_marker_size;
ll_opts.xdl_opts = o->xdl_opts;
 
if (o->call_depth) {
@@ -1281,6 +1283,7 @@ static int merge_mode_and_contents(struct merge_options 
*o,
   const char *filename,
   const char *branch1,
   const char *branch2,
+  const int extra_marker_size,
   struct merge_file_info *result)
 {
result->merge = 0;
@@ -1321,7 +1324,8 @@ static int merge_mode_and_contents(struct merge_options 
*o,
int ret = 0, merge_status;
 
merge_status = merge_3way(o, _buf, one, a, b,
- branch1, branch2);
+ branch1, branch2,
+ extra_marker_size);
 
if ((merge_status < 0) || !result_buf.ptr)
ret = err(o, _("Failed to execute internal 
merge"));
@@ -1610,7 +1614,8 @@ static int handle_rename_rename_1to2(struct merge_options 
*o,
struct diff_filespec other;
struct diff_filespec *add;
if (merge_mode_and_contents(o, one, a, b, one->path,
-   ci->branch1, ci->branch2, ))
+   ci->branch1, ci->branch2,
+   o->call_depth * 2, ))
return -1;
 
/*
@@ -1677,9 +1682,11 @@ static int handle_rename_rename_2to1(struct 
merge_options *o,
path_side_1_desc = xstrfmt("%s (was %s)", path, a->path);
path_side_2_desc = xstrfmt("%s (was %s)", path, b->path);
if (merge_mode_and_contents(o, a, c1, >ren1_other, path_side_1_desc,
-   o->branch1, o->branch2, _c1) ||
+   o->branch1, o->branch2,
+   o->call_depth * 2, _c1) ||
merge_mode_and_contents(o, b, >ren2_other, c2, path_side_2_desc,
-   o->branch1, o->branch2, _c2))
+   o->branch1, o->branch2,
+   o->call_depth * 2, _c2))
return -1;
free(path_side_1_desc);
free(path_side_2_desc);
@@ -2725,7 +2732,7 @@ static int process_renames(struct merge_options 

[PATCH 3/4] merge-recursive: improve auto-merging messages with path collisions

2018-10-12 Thread Elijah Newren
Each individual file involved in a rename could have also been modified
on both sides of history, meaning it may need to have content merges.
If two such files are renamed into the same location, then on top of the
two natural auto-merging messages we also have to two-way merge the
result, giving us messages that look like

  Auto-merging somefile.c (was somecase.c)
  Auto-merging somefile.c (was somefolder.c)
  Auto-merging somefile.c

However, despite the fact that I was the one who put the "(was %s)"
portions into the messages (and just a few months ago), I was still
initially confused when running into a rename/rename(2to1) case and
wondered if somefile.c had been merged three times.  Update this to
instead be:

  Auto-merging version of somefile.c from somecase.c
  Auto-merging version of somefile.c from someportfolio.c
  Auto-merging somefile.c

This is an admittedly long set of messages for a single path, but you
only get all three messages when dealing with the rare case of a
rename/rename(2to1) conflict where both sides of both original files
were also modified, in conflicting ways.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 2452788d28..33cd9ee81f 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1679,8 +1679,8 @@ static int handle_rename_rename_2to1(struct merge_options 
*o,
remove_file(o, 1, a->path, o->call_depth || 
would_lose_untracked(a->path));
remove_file(o, 1, b->path, o->call_depth || 
would_lose_untracked(b->path));
 
-   path_side_1_desc = xstrfmt("%s (was %s)", path, a->path);
-   path_side_2_desc = xstrfmt("%s (was %s)", path, b->path);
+   path_side_1_desc = xstrfmt("version of %s from %s", path, a->path);
+   path_side_2_desc = xstrfmt("version of %s from %s", path, b->path);
if (merge_mode_and_contents(o, a, c1, >ren1_other, path_side_1_desc,
o->branch1, o->branch2,
o->call_depth * 2, _c1) ||
-- 
2.19.0.235.g7c386e1068



[PATCH 0/4] More merge cleanups

2018-10-12 Thread Elijah Newren
This series builds on en/merge-cleanup.  Technically, this could be
broken into three separate topics with only one of them depending on
en/merge-cleanup, but I have a subsequent series that depends on both
en/merge-cleanup and the fixes here, so I'm submitting these four
patches as a set.

Elijah Newren (4):
  t6036: add testcase where virtual merge base contains nested conflicts
  merge-recursive: increase marker length with depth of recursion

Improving diff3 conflict markers in the face of arbitrarily deeply
nested conflicts

  merge-recursive: improve auto-merging messages with path collisions

Simple attempt at wording improvement

  merge-recursive: Avoid showing conflicts with merge branch before HEAD

Conflict markers are expected to be shown in a certain order

 ll-merge.c|   4 +-
 ll-merge.h|   1 +
 merge-recursive.c |  59 +--
 t/t6036-recursive-corner-cases.sh | 159 +-
 4 files changed, 208 insertions(+), 15 deletions(-)

-- 
2.19.0.235.g7c386e1068



[PATCH 1/4] t6036: add testcase where virtual merge base contains nested conflicts

2018-10-12 Thread Elijah Newren
When merges involve content conflicts, merge.conflictstyle=diff3 can be
used to also show the content for the file from the base version.  If
there was more than one merge base, then the "base version" is
determined by merging the merge bases to create a virtual merge base.
This means that the "base version" of the file can itself already have
conflict markers.  In the past, this could have created a situation
where users have difficulty distinguishing between the inner and outer
conflict markers.  However, commit d694a17986a2 ("ll-merge: use a longer
conflict marker for internal merge", 2016-04-14) addressed this issue by
making conflict markers two characters longer for a virtual merge base.

Unfortunately, this solution was incomplete.  The recursive merge
algorithm is, as you'd expect from it's name, recursive in nature.  In
particular, this means that the virtual merge base itself could have had
a virtual merge base of its own that contained conflict markers.  In
other words, we can have three (or more) levels of conflict markers when
using merge.conflictstyle=diff3.  Thus, what we need is for conflict
marker length to increase with the depth of recursion.  Add a testcase
demonstrating the problem and testing for marker length increasing with
increasing recursion depth.

Signed-off-by: Elijah Newren 
---
 t/t6036-recursive-corner-cases.sh | 151 ++
 1 file changed, 151 insertions(+)

diff --git a/t/t6036-recursive-corner-cases.sh 
b/t/t6036-recursive-corner-cases.sh
index 59e52c5a09..1d1135082c 100755
--- a/t/t6036-recursive-corner-cases.sh
+++ b/t/t6036-recursive-corner-cases.sh
@@ -1402,4 +1402,155 @@ test_expect_failure 'check conflicting modes for 
regular file' '
)
 '
 
+# Setup:
+#  L1---L2---L3
+# /  \ /  \ /  \
+#   masterX1   X2   ?
+# \  / \  / \  /
+#  R1---R2---R3
+#
+# Where:
+#   master has one file named 'content'
+#   branches L1 and R1 both modify each of the two files in conflicting ways
+#
+#   L (n>1) is a merge of R into L
+#   R (n>1) is a merge of L into R
+#   L and R resolve the conflicts differently.
+#
+#   X is an auto-generated merge-base used when merging L and R.
+#   By construction, X1 has conflict markers due to conflicting versions.
+#   X2, due to using merge.conflictstyle=3, has nested conflict markers.
+#
+#   So, merging R3 into L3 using merge.conflictstyle=3 should show the
+#   nested conflict markers from X2 in the base version -- that means we
+#   have three levels of conflict markers.  Can we distinguish all three?
+
+test_expect_success "setup virtual merge base with nested conflicts" '
+   test_create_repo virtual_merge_base_has_nested_conflicts &&
+   (
+   cd virtual_merge_base_has_nested_conflicts &&
+
+   # Create some related files now
+   for i in $(test_seq 1 10)
+   do
+   echo Random base content line $i
+   done >content &&
+
+   # Setup original commit
+   git add content &&
+   test_tick && git commit -m initial &&
+
+   git branch L &&
+   git branch R &&
+
+   # Create L1
+   git checkout L &&
+   echo left >>content &&
+   git add content &&
+   test_tick && git commit -m "version L1 of content" &&
+   git tag L1 &&
+
+   # Create R1
+   git checkout R &&
+   echo right >>content &&
+   git add content &&
+   test_tick && git commit -m "verson R1 of content" &&
+   git tag R1 &&
+
+   # Create L2
+   git checkout L &&
+   test_must_fail git -c merge.conflictstyle=diff3 merge R1 &&
+   git checkout L1 content &&
+   test_tick && git commit -m "version L2 of content" &&
+   git tag L2 &&
+
+   # Create R2
+   git checkout R &&
+   test_must_fail git -c merge.conflictstyle=diff3 merge L1 &&
+   git checkout R1 content &&
+   test_tick && git commit -m "version R2 of content" &&
+   git tag R2 &&
+
+   # Create L3
+   git checkout L &&
+   test_must_fail git -c merge.conflictstyle=diff3 merge R2 &&
+   git checkout L1 content &&
+   test_tick && git commit -m "version L3 of content" &&
+   git tag L3 &&
+
+   # Create R3
+   git checkout R &&
+   test_must_fail git -c merge.conflictstyle=diff3 merge L2 &&
+   git checkout R1 content &&
+   test_tick && git commit -m "version R3 of content" &&
+   git tag R3
+   )
+'
+
+test_expect_failure "check virtual merge base with nested conflicts" '
+   (
+   cd virtual_merge_base_has_nested_conflicts &&
+
+   git checkout L3^0 &&
+
+ 

[PATCH 4/4] merge-recursive: Avoid showing conflicts with merge branch before HEAD

2018-10-12 Thread Elijah Newren
We want to load unmerged entries from HEAD into the index at stage 2 and
from MERGE_HEAD into stage 3.  Similarly, folks expect merge conflicts
to look like

 HEAD
content from our side

content from their side
 MERGE_HEAD

not

 MERGE_HEAD
content from their side

content from our side
 HEAD

The correct order usually comes naturally and for free, but with renames
we often have data in the form {rename_branch, other_branch}, and
working relative to the rename first (e.g. for rename/add) is more
convenient elsewhere in the code.  Address the slight impedance
mismatch by having some functions re-call themselves with flipped
arguments when the branch order is reversed.

Note that setup_rename_conflict_info() has one asymmetry in it, in
setting dst_entry1->processed=0 but not doing similarly for
dst_entry2->processed.  When dealing with rename/rename and similar
conflicts, we do not want the processing to happen twice, so the
desire to only set one of the entries to unprocessed is intentional.
So, while this change modifies which branch's entry will be marked as
unprocessed, that dovetails nicely with putting HEAD first so that we
get the index stage entries and conflict markers in the right order.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 33 ++-
 t/t6036-recursive-corner-cases.sh |  8 
 2 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 33cd9ee81f..f795c92a69 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -228,7 +228,26 @@ static inline void setup_rename_conflict_info(enum 
rename_type rename_type,
  struct stage_data *src_entry1,
  struct stage_data *src_entry2)
 {
-   struct rename_conflict_info *ci = xcalloc(1, sizeof(struct 
rename_conflict_info));
+   struct rename_conflict_info *ci;
+
+   /*
+* When we have two renames involved, it's easiest to get the
+* correct things into stage 2 and 3, and to make sure that the
+* content merge puts HEAD before the other branch if we just
+* ensure that branch1 == o->branch1.  So, simply flip arguments
+* around if we don't have that.
+*/
+   if (dst_entry2 && branch1 != o->branch1) {
+   setup_rename_conflict_info(rename_type,
+  pair2,  pair1,
+  branch2,branch1,
+  dst_entry2, dst_entry1,
+  o,
+  src_entry2, src_entry1);
+   return;
+   }
+
+   ci = xcalloc(1, sizeof(struct rename_conflict_info));
ci->rename_type = rename_type;
ci->pair1 = pair1;
ci->branch1 = branch1;
@@ -1286,6 +1305,18 @@ static int merge_mode_and_contents(struct merge_options 
*o,
   const int extra_marker_size,
   struct merge_file_info *result)
 {
+   if (o->branch1 != branch1) {
+   /*
+* It's weird getting a reverse merge with HEAD on the bottom
+* side of the conflict markers and the other branch on the
+* top.  Fix that.
+*/
+   return merge_mode_and_contents(o, one, b, a,
+  filename,
+  branch2, branch1,
+  extra_marker_size, result);
+   }
+
result->merge = 0;
result->clean = 1;
 
diff --git a/t/t6036-recursive-corner-cases.sh 
b/t/t6036-recursive-corner-cases.sh
index 21954db624..276b4e8792 100755
--- a/t/t6036-recursive-corner-cases.sh
+++ b/t/t6036-recursive-corner-cases.sh
@@ -230,13 +230,13 @@ test_expect_success 'git detects differently handled 
merges conflict' '
:2:new_a :3:new_a &&
test_cmp expect actual &&
 
-   git cat-file -p B:new_a >ours &&
-   git cat-file -p C:new_a >theirs &&
+   git cat-file -p C:new_a >ours &&
+   git cat-file -p B:new_a >theirs &&
>empty &&
test_must_fail git merge-file \
-   -L "Temporary merge branch 2" \
-   -L "" \
-L "Temporary merge branch 1" \
+   -L "" \
+   -L "Temporary merge branch 2" \
ours empty theirs &&
sed -e "s/^\([<=>]\)/\1\1\1/" ours >expect &&
git cat-file -p :1:new_a >actual &&
-- 
2.19.0.235.g7c386e1068



[PATCH v3] gpg-interface.c: detect and reject multiple signatures on commits

2018-10-12 Thread Michał Górny
GnuPG supports creating signatures consisting of multiple signature
packets.  If such a signature is verified, it outputs all the status
messages for each signature separately.  However, git currently does not
account for such scenario and gets terribly confused over getting
multiple *SIG statuses.

For example, if a malicious party alters a signed commit and appends
a new untrusted signature, git is going to ignore the original bad
signature and report untrusted commit instead.  However, %GK and %GS
format strings may still expand to the data corresponding
to the original signature, potentially tricking the scripts into
trusting the malicious commit.

Given that the use of multiple signatures is quite rare, git does not
support creating them without jumping through a few hoops, and finally
supporting them properly would require extensive API improvement, it
seems reasonable to just reject them at the moment.

Signed-off-by: Michał Górny 
---
 gpg-interface.c  | 94 +++-
 t/t7510-signed-commit.sh | 26 +++
 2 files changed, 91 insertions(+), 29 deletions(-)

Changes in v3: reworked the whole loop to iterate over lines rather
than scanning the whole buffer, as requested.  Now it also catches
duplicate instances of the same status.

diff --git a/gpg-interface.c b/gpg-interface.c
index db17d65f8..480aab4ee 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -75,48 +75,84 @@ void signature_check_clear(struct signature_check *sigc)
FREE_AND_NULL(sigc->key);
 }
 
+/* An exclusive status -- only one of them can appear in output */
+#define GPG_STATUS_EXCLUSIVE   (1<<0)
+
 static struct {
char result;
const char *check;
+   unsigned int flags;
 } sigcheck_gpg_status[] = {
-   { 'G', "\n[GNUPG:] GOODSIG " },
-   { 'B', "\n[GNUPG:] BADSIG " },
-   { 'U', "\n[GNUPG:] TRUST_NEVER" },
-   { 'U', "\n[GNUPG:] TRUST_UNDEFINED" },
-   { 'E', "\n[GNUPG:] ERRSIG "},
-   { 'X', "\n[GNUPG:] EXPSIG "},
-   { 'Y', "\n[GNUPG:] EXPKEYSIG "},
-   { 'R', "\n[GNUPG:] REVKEYSIG "},
+   { 'G', "GOODSIG ", GPG_STATUS_EXCLUSIVE },
+   { 'B', "BADSIG ", GPG_STATUS_EXCLUSIVE },
+   { 'U', "TRUST_NEVER", 0 },
+   { 'U', "TRUST_UNDEFINED", 0 },
+   { 'E', "ERRSIG ", GPG_STATUS_EXCLUSIVE },
+   { 'X', "EXPSIG ", GPG_STATUS_EXCLUSIVE },
+   { 'Y', "EXPKEYSIG ", GPG_STATUS_EXCLUSIVE },
+   { 'R', "REVKEYSIG ", GPG_STATUS_EXCLUSIVE },
 };
 
 static void parse_gpg_output(struct signature_check *sigc)
 {
const char *buf = sigc->gpg_status;
+   const char *line, *next;
int i;
-
-   /* Iterate over all search strings */
-   for (i = 0; i < ARRAY_SIZE(sigcheck_gpg_status); i++) {
-   const char *found, *next;
-
-   if (!skip_prefix(buf, sigcheck_gpg_status[i].check + 1, 
)) {
-   found = strstr(buf, sigcheck_gpg_status[i].check);
-   if (!found)
-   continue;
-   found += strlen(sigcheck_gpg_status[i].check);
-   }
-   sigc->result = sigcheck_gpg_status[i].result;
-   /* The trust messages are not followed by key/signer 
information */
-   if (sigc->result != 'U') {
-   next = strchrnul(found, ' ');
-   sigc->key = xmemdupz(found, next - found);
-   /* The ERRSIG message is not followed by signer 
information */
-   if (*next && sigc-> result != 'E') {
-   found = next + 1;
-   next = strchrnul(found, '\n');
-   sigc->signer = xmemdupz(found, next - found);
+   int had_exclusive_status = 0;
+
+   /* Iterate over all lines */
+   for (line = buf; *line; line = strchrnul(line+1, '\n')) {
+   while (*line == '\n')
+   line++;
+   /* Skip lines that don't start with GNUPG status */
+   if (strncmp(line, "[GNUPG:] ", 9))
+   continue;
+   line += 9;
+
+   /* Iterate over all search strings */
+   for (i = 0; i < ARRAY_SIZE(sigcheck_gpg_status); i++) {
+   if (!strncmp(line, sigcheck_gpg_status[i].check,
+   strlen(sigcheck_gpg_status[i].check))) {
+   line += strlen(sigcheck_gpg_status[i].check);
+
+   if (sigcheck_gpg_status[i].flags & 
GPG_STATUS_EXCLUSIVE)
+   had_exclusive_status++;
+
+   sigc->result = sigcheck_gpg_status[i].result;
+   /* The trust messages are not followed by 
key/signer information */
+   if (sigc->result != 'U') {
+   next = strchrnul(line, ' ');
+   

[PATCH v2 1/1] zlib.c: use size_t for size

2018-10-12 Thread tboegi
From: Martin Koegler 

Signed-off-by: Martin Koegler 
Signed-off-by: Junio C Hamano 
Signed-off-by: Torsten Bögershausen 
---

After doing a review, I decided to send the result as a patch.
In general, the changes from off_t to size_t seem to be not really
motivated.
But if they are, they could and should go into an own patch.
For the moment, change only "unsigned long" into size_t, thats all

 builtin/pack-objects.c |  8 
 cache.h| 10 +-
 pack-check.c   |  4 ++--
 packfile.h |  2 +-
 wrapper.c  |  8 
 zlib.c |  8 
 6 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index e6316d294d..23c4cd8c77 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -269,12 +269,12 @@ static void copy_pack_data(struct hashfile *f,
off_t len)
 {
unsigned char *in;
-   unsigned long avail;
+   size_t avail;
 
while (len) {
in = use_pack(p, w_curs, offset, );
if (avail > len)
-   avail = (unsigned long)len;
+   avail = xsize_t(len);
hashwrite(f, in, avail);
offset += avail;
len -= avail;
@@ -1478,8 +1478,8 @@ static void check_object(struct object_entry *entry)
struct pack_window *w_curs = NULL;
const unsigned char *base_ref = NULL;
struct object_entry *base_entry;
-   unsigned long used, used_0;
-   unsigned long avail;
+   size_t used, used_0;
+   size_t avail;
off_t ofs;
unsigned char *buf, c;
enum object_type type;
diff --git a/cache.h b/cache.h
index d508f3d4f8..fce53fe620 100644
--- a/cache.h
+++ b/cache.h
@@ -20,10 +20,10 @@
 #include 
 typedef struct git_zstream {
z_stream z;
-   unsigned long avail_in;
-   unsigned long avail_out;
-   unsigned long total_in;
-   unsigned long total_out;
+   size_t avail_in;
+   size_t avail_out;
+   size_t total_in;
+   size_t total_out;
unsigned char *next_in;
unsigned char *next_out;
 } git_zstream;
@@ -40,7 +40,7 @@ void git_deflate_end(git_zstream *);
 int git_deflate_abort(git_zstream *);
 int git_deflate_end_gently(git_zstream *);
 int git_deflate(git_zstream *, int flush);
-unsigned long git_deflate_bound(git_zstream *, unsigned long);
+size_t git_deflate_bound(git_zstream *, size_t);
 
 /* The length in bytes and in hex digits of an object name (SHA-1 value). */
 #define GIT_SHA1_RAWSZ 20
diff --git a/pack-check.c b/pack-check.c
index fa5f0ff8fa..d1e7f554ae 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -33,7 +33,7 @@ int check_pack_crc(struct packed_git *p, struct pack_window 
**w_curs,
uint32_t data_crc = crc32(0, NULL, 0);
 
do {
-   unsigned long avail;
+   size_t avail;
void *data = use_pack(p, w_curs, offset, );
if (avail > len)
avail = len;
@@ -68,7 +68,7 @@ static int verify_packfile(struct packed_git *p,
 
the_hash_algo->init_fn();
do {
-   unsigned long remaining;
+   size_t remaining;
unsigned char *in = use_pack(p, w_curs, offset, );
offset += remaining;
if (!pack_sig_ofs)
diff --git a/packfile.h b/packfile.h
index 442625723d..e2daf63426 100644
--- a/packfile.h
+++ b/packfile.h
@@ -78,7 +78,7 @@ extern void close_pack_index(struct packed_git *);
 
 extern uint32_t get_pack_fanout(struct packed_git *p, uint32_t value);
 
-extern unsigned char *use_pack(struct packed_git *, struct pack_window **, 
off_t, unsigned long *);
+extern unsigned char *use_pack(struct packed_git *, struct pack_window **, 
off_t, size_t *);
 extern void close_pack_windows(struct packed_git *);
 extern void close_pack(struct packed_git *);
 extern void close_all_packs(struct raw_object_store *o);
diff --git a/wrapper.c b/wrapper.c
index e4fa9d84cd..1a510bd6fc 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -67,11 +67,11 @@ static void *do_xmalloc(size_t size, int gentle)
ret = malloc(1);
if (!ret) {
if (!gentle)
-   die("Out of memory, malloc failed (tried to 
allocate %lu bytes)",
-   (unsigned long)size);
+   die("Out of memory, malloc failed (tried to 
allocate %" PRIuMAX " bytes)",
+   (uintmax_t)size);
else {
-   error("Out of memory, malloc failed (tried to 
allocate %lu bytes)",
- (unsigned long)size);
+   error("Out of memory, malloc failed (tried to 
allocate %" PRIuMAX " bytes)",
+

[PATCH v12 8/8] list-objects-filter: implement filter tree:0

2018-10-12 Thread Matthew DeVore
Teach list-objects the "tree:0" filter which allows for filtering
out all tree and blob objects (unless other objects are explicitly
specified by the user). The purpose of this patch is to allow smaller
partial clones.

The name of this filter - tree:0 - does not explicitly specify that
it also filters out all blobs, but this should not cause much confusion
because blobs are not at all useful without the trees that refer to
them.

I also considered only:commits as a name, but this is inaccurate because
it suggests that annotated tags are omitted, but actually they are
included.

The name "tree:0" allows later filtering based on depth, i.e. "tree:1"
would filter out all but the root tree and blobs. In order to avoid
confusion between 0 and capital O, the documentation was worded in a
somewhat round-about way that also hints at this future improvement to
the feature.

Signed-off-by: Matthew DeVore 
---
 Documentation/rev-list-options.txt |  5 +++
 list-objects-filter-options.c  | 13 +++
 list-objects-filter-options.h  |  1 +
 list-objects-filter.c  | 49 ++
 t/t5317-pack-objects-filter-objects.sh | 28 +++
 t/t5616-partial-clone.sh   | 42 ++
 t/t6112-rev-list-filters-objects.sh| 15 
 7 files changed, 153 insertions(+)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 7b273635d..5f1672913 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -731,6 +731,11 @@ the requested refs.
 +
 The form '--filter=sparse:path=' similarly uses a sparse-checkout
 specification contained in .
++
+The form '--filter=tree:' omits all blobs and trees whose depth
+from the root tree is >=  (minimum depth if an object is located
+at multiple depths in the commits traversed). Currently, only =0
+is supported, which omits all blobs and trees.
 
 --no-filter::
Turn off any previous `--filter=` argument.
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index d259bdb2c..e8da2e858 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -49,6 +49,19 @@ static int gently_parse_list_objects_filter(
return 0;
}
 
+   } else if (skip_prefix(arg, "tree:", )) {
+   unsigned long depth;
+   if (!git_parse_ulong(v0, ) || depth != 0) {
+   if (errbuf) {
+   strbuf_addstr(
+   errbuf,
+   _("only 'tree:0' is supported"));
+   }
+   return 1;
+   }
+   filter_options->choice = LOFC_TREE_NONE;
+   return 0;
+
} else if (skip_prefix(arg, "sparse:oid=", )) {
struct object_context oc;
struct object_id sparse_oid;
diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h
index a61f8..af64e5c66 100644
--- a/list-objects-filter-options.h
+++ b/list-objects-filter-options.h
@@ -10,6 +10,7 @@ enum list_objects_filter_choice {
LOFC_DISABLED = 0,
LOFC_BLOB_NONE,
LOFC_BLOB_LIMIT,
+   LOFC_TREE_NONE,
LOFC_SPARSE_OID,
LOFC_SPARSE_PATH,
LOFC__COUNT /* must be last */
diff --git a/list-objects-filter.c b/list-objects-filter.c
index 5f8b1a002..09b2b05d5 100644
--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -79,6 +79,54 @@ static void *filter_blobs_none__init(
return d;
 }
 
+/*
+ * A filter for list-objects to omit ALL trees and blobs from the traversal.
+ * Can OPTIONALLY collect a list of the omitted OIDs.
+ */
+struct filter_trees_none_data {
+   struct oidset *omits;
+};
+
+static enum list_objects_filter_result filter_trees_none(
+   enum list_objects_filter_situation filter_situation,
+   struct object *obj,
+   const char *pathname,
+   const char *filename,
+   void *filter_data_)
+{
+   struct filter_trees_none_data *filter_data = filter_data_;
+
+   switch (filter_situation) {
+   default:
+   BUG("unknown filter_situation: %d", filter_situation);
+
+   case LOFS_BEGIN_TREE:
+   case LOFS_BLOB:
+   if (filter_data->omits)
+   oidset_insert(filter_data->omits, >oid);
+   return LOFR_MARK_SEEN; /* but not LOFR_DO_SHOW (hard omit) */
+
+   case LOFS_END_TREE:
+   assert(obj->type == OBJ_TREE);
+   return LOFR_ZERO;
+
+   }
+}
+
+static void* filter_trees_none__init(
+   struct oidset *omitted,
+   struct list_objects_filter_options *filter_options,
+   filter_object_fn *filter_fn,
+   filter_free_fn *filter_free_fn)
+{
+   struct filter_trees_none_data *d = xcalloc(1, sizeof(*d));
+   d->omits = omitted;
+
+   *filter_fn = filter_trees_none;
+   

[PATCH v12 7/8] list-objects-filter-options: do not over-strbuf_init

2018-10-12 Thread Matthew DeVore
The function gently_parse_list_objects_filter is either called with
errbuf=STRBUF_INIT or errbuf=NULL, but that function calls strbuf_init
when errbuf is not NULL. strbuf_init is only necessary if errbuf
contains garbage, and risks a memory leak if errbuf already has a
non-STRBUF_INIT state. It should be the caller's responsibility to make
sure errbuf is not garbage, since garbage content is easily avoidable
with STRBUF_INIT.

Signed-off-by: Matthew DeVore 
---
 list-objects-filter-options.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index c0e2bd6a0..d259bdb2c 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -30,7 +30,6 @@ static int gently_parse_list_objects_filter(
 
if (filter_options->choice) {
if (errbuf) {
-   strbuf_init(errbuf, 0);
strbuf_addstr(
errbuf,
_("multiple filter-specs cannot be combined"));
@@ -71,10 +70,9 @@ static int gently_parse_list_objects_filter(
return 0;
}
 
-   if (errbuf) {
-   strbuf_init(errbuf, 0);
+   if (errbuf)
strbuf_addf(errbuf, "invalid filter-spec '%s'", arg);
-   }
+
memset(filter_options, 0, sizeof(*filter_options));
return 1;
 }
-- 
2.19.1.331.ge82ca0e54c-goog



[PATCH v12 4/8] rev-list: handle missing tree objects properly

2018-10-12 Thread Matthew DeVore
Previously, we assumed only blob objects could be missing. This patch
makes rev-list handle missing trees like missing blobs. The --missing=*
and --exclude-promisor-objects flags now work for trees as they already
do for blobs. This is demonstrated in t6112.

Signed-off-by: Matthew DeVore 
---
 builtin/rev-list.c | 11 ---
 list-objects.c | 11 +--
 revision.h | 15 +
 t/t0410-partial-clone.sh   | 45 ++
 t/t5317-pack-objects-filter-objects.sh | 13 
 t/t6112-rev-list-filters-objects.sh| 22 +
 6 files changed, 110 insertions(+), 7 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 5b07f3f4a..49d6deed7 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -6,6 +6,7 @@
 #include "list-objects.h"
 #include "list-objects-filter.h"
 #include "list-objects-filter-options.h"
+#include "object.h"
 #include "object-store.h"
 #include "pack.h"
 #include "pack-bitmap.h"
@@ -209,7 +210,8 @@ static inline void finish_object__ma(struct object *obj)
 */
switch (arg_missing_action) {
case MA_ERROR:
-   die("missing blob object '%s'", oid_to_hex(>oid));
+   die("missing %s object '%s'",
+   type_name(obj->type), oid_to_hex(>oid));
return;
 
case MA_ALLOW_ANY:
@@ -222,8 +224,8 @@ static inline void finish_object__ma(struct object *obj)
case MA_ALLOW_PROMISOR:
if (is_promisor_object(>oid))
return;
-   die("unexpected missing blob object '%s'",
-   oid_to_hex(>oid));
+   die("unexpected missing %s object '%s'",
+   type_name(obj->type), oid_to_hex(>oid));
return;
 
default:
@@ -235,7 +237,7 @@ static inline void finish_object__ma(struct object *obj)
 static int finish_object(struct object *obj, const char *name, void *cb_data)
 {
struct rev_list_info *info = cb_data;
-   if (obj->type == OBJ_BLOB && !has_object_file(>oid)) {
+   if (!has_object_file(>oid)) {
finish_object__ma(obj);
return 1;
}
@@ -373,6 +375,7 @@ int cmd_rev_list(int argc, const char **argv, const char 
*prefix)
init_revisions(, prefix);
revs.abbrev = DEFAULT_ABBREV;
revs.commit_format = CMIT_FMT_UNSPECIFIED;
+   revs.do_not_die_on_missing_tree = 1;
 
/*
 * Scan the argument list before invoking setup_revisions(), so that we
diff --git a/list-objects.c b/list-objects.c
index f9b51db7a..243192af5 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -143,6 +143,7 @@ static void process_tree(struct traversal_context *ctx,
struct rev_info *revs = ctx->revs;
int baselen = base->len;
enum list_objects_filter_result r = LOFR_MARK_SEEN | LOFR_DO_SHOW;
+   int failed_parse;
 
if (!revs->tree_objects)
return;
@@ -150,7 +151,9 @@ static void process_tree(struct traversal_context *ctx,
die("bad tree object");
if (obj->flags & (UNINTERESTING | SEEN))
return;
-   if (parse_tree_gently(tree, 1) < 0) {
+
+   failed_parse = parse_tree_gently(tree, 1);
+   if (failed_parse) {
if (revs->ignore_missing_links)
return;
 
@@ -163,7 +166,8 @@ static void process_tree(struct traversal_context *ctx,
is_promisor_object(>oid))
return;
 
-   die("bad tree object %s", oid_to_hex(>oid));
+   if (!revs->do_not_die_on_missing_tree)
+   die("bad tree object %s", oid_to_hex(>oid));
}
 
strbuf_addstr(base, name);
@@ -178,7 +182,8 @@ static void process_tree(struct traversal_context *ctx,
if (base->len)
strbuf_addch(base, '/');
 
-   process_tree_contents(ctx, tree, base);
+   if (!failed_parse)
+   process_tree_contents(ctx, tree, base);
 
if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) {
r = ctx->filter_fn(LOFS_END_TREE, obj,
diff --git a/revision.h b/revision.h
index 007278cc1..5910613cb 100644
--- a/revision.h
+++ b/revision.h
@@ -126,6 +126,21 @@ struct rev_info {
line_level_traverse:1,
tree_blobs_in_commit_order:1,
 
+   /*
+* Blobs are shown without regard for their existence.
+* But not so for trees: unless exclude_promisor_objects
+* is set and the tree in question is a promisor object;
+* OR ignore_missing_links is set, the revision walker
+* dies with a "bad tree object HASH" message when
+* encountering a missing tree. For callers that can
+* handle missing trees 

[PATCH v12 3/8] list-objects: always parse trees gently

2018-10-12 Thread Matthew DeVore
If parsing fails when revs->ignore_missing_links and
revs->exclude_promisor_objects are both false, we print the OID anyway
in the die("bad tree object...") call, so any message printed by
parse_tree_gently() is superfluous.

Signed-off-by: Matthew DeVore 
---
 list-objects.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/list-objects.c b/list-objects.c
index ccc529e5e..f9b51db7a 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -143,8 +143,6 @@ static void process_tree(struct traversal_context *ctx,
struct rev_info *revs = ctx->revs;
int baselen = base->len;
enum list_objects_filter_result r = LOFR_MARK_SEEN | LOFR_DO_SHOW;
-   int gently = revs->ignore_missing_links ||
-revs->exclude_promisor_objects;
 
if (!revs->tree_objects)
return;
@@ -152,7 +150,7 @@ static void process_tree(struct traversal_context *ctx,
die("bad tree object");
if (obj->flags & (UNINTERESTING | SEEN))
return;
-   if (parse_tree_gently(tree, gently) < 0) {
+   if (parse_tree_gently(tree, 1) < 0) {
if (revs->ignore_missing_links)
return;
 
-- 
2.19.1.331.ge82ca0e54c-goog



[PATCH v12 5/8] revision: mark non-user-given objects instead

2018-10-12 Thread Matthew DeVore
Currently, list-objects.c incorrectly treats all root trees of commits
as USER_GIVEN. Also, it would be easier to mark objects that are
non-user-given instead of user-given, since the places in the code
where we access an object through a reference are more obvious than
the places where we access an object that was given by the user.

Resolve these two problems by introducing a flag NOT_USER_GIVEN that
marks blobs and trees that are non-user-given, replacing USER_GIVEN.
(Only blobs and trees are marked because this mark is only used when
filtering objects, and filtering of other types of objects is not
supported yet.)

This fixes a bug in that git rev-list behaved differently from git
pack-objects. pack-objects would *not* filter objects given explicitly
on the command line and rev-list would filter. This was because the two
commands used a different function to add objects to the rev_info
struct. This seems to have been an oversight, and pack-objects has the
correct behavior, so I added a test to make sure that rev-list now
behaves properly.

Signed-off-by: Matthew DeVore 
---
 list-objects.c  | 31 +
 revision.c  |  1 -
 revision.h  | 11 --
 t/t6112-rev-list-filters-objects.sh | 12 +++
 4 files changed, 39 insertions(+), 16 deletions(-)

diff --git a/list-objects.c b/list-objects.c
index 243192af5..7a1a0929d 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -53,7 +53,7 @@ static void process_blob(struct traversal_context *ctx,
 
pathlen = path->len;
strbuf_addstr(path, name);
-   if (!(obj->flags & USER_GIVEN) && ctx->filter_fn)
+   if ((obj->flags & NOT_USER_GIVEN) && ctx->filter_fn)
r = ctx->filter_fn(LOFS_BLOB, obj,
   path->buf, >buf[pathlen],
   ctx->filter_data);
@@ -120,17 +120,19 @@ static void process_tree_contents(struct 
traversal_context *ctx,
continue;
}
 
-   if (S_ISDIR(entry.mode))
-   process_tree(ctx,
-lookup_tree(the_repository, entry.oid),
-base, entry.path);
+   if (S_ISDIR(entry.mode)) {
+   struct tree *t = lookup_tree(the_repository, entry.oid);
+   t->object.flags |= NOT_USER_GIVEN;
+   process_tree(ctx, t, base, entry.path);
+   }
else if (S_ISGITLINK(entry.mode))
process_gitlink(ctx, entry.oid->hash,
base, entry.path);
-   else
-   process_blob(ctx,
-lookup_blob(the_repository, entry.oid),
-base, entry.path);
+   else {
+   struct blob *b = lookup_blob(the_repository, entry.oid);
+   b->object.flags |= NOT_USER_GIVEN;
+   process_blob(ctx, b, base, entry.path);
+   }
}
 }
 
@@ -171,7 +173,7 @@ static void process_tree(struct traversal_context *ctx,
}
 
strbuf_addstr(base, name);
-   if (!(obj->flags & USER_GIVEN) && ctx->filter_fn)
+   if ((obj->flags & NOT_USER_GIVEN) && ctx->filter_fn)
r = ctx->filter_fn(LOFS_BEGIN_TREE, obj,
   base->buf, >buf[baselen],
   ctx->filter_data);
@@ -185,7 +187,7 @@ static void process_tree(struct traversal_context *ctx,
if (!failed_parse)
process_tree_contents(ctx, tree, base);
 
-   if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) {
+   if ((obj->flags & NOT_USER_GIVEN) && ctx->filter_fn) {
r = ctx->filter_fn(LOFS_END_TREE, obj,
   base->buf, >buf[baselen],
   ctx->filter_data);
@@ -301,8 +303,11 @@ static void do_traverse(struct traversal_context *ctx)
 * an uninteresting boundary commit may not have its tree
 * parsed yet, but we are not going to show them anyway
 */
-   if (get_commit_tree(commit))
-   add_pending_tree(ctx->revs, get_commit_tree(commit));
+   if (get_commit_tree(commit)) {
+   struct tree *tree = get_commit_tree(commit);
+   tree->object.flags |= NOT_USER_GIVEN;
+   add_pending_tree(ctx->revs, tree);
+   }
ctx->show_commit(commit, ctx->show_data);
 
if (ctx->revs->tree_blobs_in_commit_order)
diff --git a/revision.c b/revision.c
index de4dce600..72d48a17f 100644
--- a/revision.c
+++ b/revision.c
@@ -175,7 +175,6 @@ static void add_pending_object_with_path(struct rev_info 
*revs,

[PATCH v12 6/8] list-objects-filter: use BUG rather than die

2018-10-12 Thread Matthew DeVore
In some cases in this file, BUG makes more sense than die. In such
cases, a we get there from a coding error rather than a user error.

'return' has been removed following some instances of BUG since BUG does
not return.

Signed-off-by: Matthew DeVore 
---
 list-objects-filter.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/list-objects-filter.c b/list-objects-filter.c
index a0ba78b20..5f8b1a002 100644
--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -44,8 +44,7 @@ static enum list_objects_filter_result filter_blobs_none(
 
switch (filter_situation) {
default:
-   die("unknown filter_situation");
-   return LOFR_ZERO;
+   BUG("unknown filter_situation: %d", filter_situation);
 
case LOFS_BEGIN_TREE:
assert(obj->type == OBJ_TREE);
@@ -102,8 +101,7 @@ static enum list_objects_filter_result filter_blobs_limit(
 
switch (filter_situation) {
default:
-   die("unknown filter_situation");
-   return LOFR_ZERO;
+   BUG("unknown filter_situation: %d", filter_situation);
 
case LOFS_BEGIN_TREE:
assert(obj->type == OBJ_TREE);
@@ -208,8 +206,7 @@ static enum list_objects_filter_result filter_sparse(
 
switch (filter_situation) {
default:
-   die("unknown filter_situation");
-   return LOFR_ZERO;
+   BUG("unknown filter_situation: %d", filter_situation);
 
case LOFS_BEGIN_TREE:
assert(obj->type == OBJ_TREE);
@@ -389,7 +386,7 @@ void *list_objects_filter__init(
assert((sizeof(s_filters) / sizeof(s_filters[0])) == LOFC__COUNT);
 
if (filter_options->choice >= LOFC__COUNT)
-   die("invalid list-objects filter choice: %d",
+   BUG("invalid list-objects filter choice: %d",
filter_options->choice);
 
init_fn = s_filters[filter_options->choice];
-- 
2.19.1.331.ge82ca0e54c-goog



[PATCH v12 0/8] filter: support for excluding all trees and blobs

2018-10-12 Thread Matthew DeVore
Here is a re-roll-up since I haven't received any additional corrections for
almost a week. The changes are very slight and just for clean-up so it is ready
to be promoted to master.

This is the interdiff from last time:

diff --git a/t/t5317-pack-objects-filter-objects.sh 
b/t/t5317-pack-objects-filter-objects.sh
index 510d3537f..d9dccf4d4 100755
--- a/t/t5317-pack-objects-filter-objects.sh
+++ b/t/t5317-pack-objects-filter-objects.sh
@@ -69,7 +69,7 @@ test_expect_success 'get an error for missing tree object' '
 test_must_fail git -C r5 pack-objects --rev --stdout 2>bad_tree <<-EOF 
&&
 HEAD
 EOF
-grep -q "bad tree object" bad_tree
+grep "bad tree object" bad_tree
 '
 
 test_expect_success 'setup for tests of tree:0' '
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 53fbf7db8..392caa08f 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -192,7 +192,7 @@ test_expect_success 'use fsck before and after manually 
fetching a missing subtr
 xargs -n1 git -C dst cat-file -t >fetched_types &&
 
 sort -u fetched_types >unique_types.observed &&
-printf "blob\ncommit\ntree\n" >unique_types.expected &&
+test_write_lines blob commit tree >unique_types.expected &&
 test_cmp unique_types.expected unique_types.observed
 '
 
diff --git a/t/t6112-rev-list-filters-objects.sh 
b/t/t6112-rev-list-filters-objects.sh
index c8e3d87c4..08e0c7db6 100755
--- a/t/t6112-rev-list-filters-objects.sh
+++ b/t/t6112-rev-list-filters-objects.sh
@@ -38,8 +38,8 @@ test_expect_success 'specify blob explicitly prevents 
filtering' '
  awk -f print_2.awk) &&
 
 git -C r1 rev-list --objects --filter=blob:none HEAD $file_3 >observed 
&&
-grep -q "$file_3" observed &&
-test_must_fail grep -q "$file_4" observed
+grep "$file_3" observed &&
+! grep "$file_4" observed
 '
 
 test_expect_success 'verify emitted+omitted == all' '
@@ -240,7 +240,7 @@ test_expect_success 'verify tree:0 includes trees in 
"filtered" output' '
 xargs -n1 git -C r3 cat-file -t >unsorted_filtered_types &&
 
 sort -u unsorted_filtered_types >filtered_types &&
-printf "blob\ntree\n" >expected &&
+test_write_lines blob tree >expected &&
 test_cmp expected filtered_types
 '
 

Thanks,

Matthew DeVore (8):
  list-objects: store common func args in struct
  list-objects: refactor to process_tree_contents
  list-objects: always parse trees gently
  rev-list: handle missing tree objects properly
  revision: mark non-user-given objects instead
  list-objects-filter: use BUG rather than die
  list-objects-filter-options: do not over-strbuf_init
  list-objects-filter: implement filter tree:0

 Documentation/rev-list-options.txt |   5 +
 builtin/rev-list.c |  11 +-
 list-objects-filter-options.c  |  19 +-
 list-objects-filter-options.h  |   1 +
 list-objects-filter.c  |  60 ++-
 list-objects.c | 232 +
 revision.c |   1 -
 revision.h |  26 ++-
 t/t0410-partial-clone.sh   |  45 +
 t/t5317-pack-objects-filter-objects.sh |  41 +
 t/t5616-partial-clone.sh   |  42 +
 t/t6112-rev-list-filters-objects.sh|  49 ++
 12 files changed, 404 insertions(+), 128 deletions(-)

-- 
2.19.1.331.ge82ca0e54c-goog



[PATCH v12 1/8] list-objects: store common func args in struct

2018-10-12 Thread Matthew DeVore
This will make utility functions easier to create, as done by the next
patch.

Signed-off-by: Matthew DeVore 
---
 list-objects.c | 158 +++--
 1 file changed, 74 insertions(+), 84 deletions(-)

diff --git a/list-objects.c b/list-objects.c
index c99c47ac1..584518a3f 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -12,20 +12,25 @@
 #include "packfile.h"
 #include "object-store.h"
 
-static void process_blob(struct rev_info *revs,
+struct traversal_context {
+   struct rev_info *revs;
+   show_object_fn show_object;
+   show_commit_fn show_commit;
+   void *show_data;
+   filter_object_fn filter_fn;
+   void *filter_data;
+};
+
+static void process_blob(struct traversal_context *ctx,
 struct blob *blob,
-show_object_fn show,
 struct strbuf *path,
-const char *name,
-void *cb_data,
-filter_object_fn filter_fn,
-void *filter_data)
+const char *name)
 {
struct object *obj = >object;
size_t pathlen;
enum list_objects_filter_result r = LOFR_MARK_SEEN | LOFR_DO_SHOW;
 
-   if (!revs->blob_objects)
+   if (!ctx->revs->blob_objects)
return;
if (!obj)
die("bad blob object");
@@ -41,21 +46,21 @@ static void process_blob(struct rev_info *revs,
 * may cause the actual filter to report an incomplete list
 * of missing objects.
 */
-   if (revs->exclude_promisor_objects &&
+   if (ctx->revs->exclude_promisor_objects &&
!has_object_file(>oid) &&
is_promisor_object(>oid))
return;
 
pathlen = path->len;
strbuf_addstr(path, name);
-   if (!(obj->flags & USER_GIVEN) && filter_fn)
-   r = filter_fn(LOFS_BLOB, obj,
- path->buf, >buf[pathlen],
- filter_data);
+   if (!(obj->flags & USER_GIVEN) && ctx->filter_fn)
+   r = ctx->filter_fn(LOFS_BLOB, obj,
+  path->buf, >buf[pathlen],
+  ctx->filter_data);
if (r & LOFR_MARK_SEEN)
obj->flags |= SEEN;
if (r & LOFR_DO_SHOW)
-   show(obj, path->buf, cb_data);
+   ctx->show_object(obj, path->buf, ctx->show_data);
strbuf_setlen(path, pathlen);
 }
 
@@ -81,26 +86,21 @@ static void process_blob(struct rev_info *revs,
  * the link, and how to do it. Whether it necessarily makes
  * any sense what-so-ever to ever do that is another issue.
  */
-static void process_gitlink(struct rev_info *revs,
+static void process_gitlink(struct traversal_context *ctx,
const unsigned char *sha1,
-   show_object_fn show,
struct strbuf *path,
-   const char *name,
-   void *cb_data)
+   const char *name)
 {
/* Nothing to do */
 }
 
-static void process_tree(struct rev_info *revs,
+static void process_tree(struct traversal_context *ctx,
 struct tree *tree,
-show_object_fn show,
 struct strbuf *base,
-const char *name,
-void *cb_data,
-filter_object_fn filter_fn,
-void *filter_data)
+const char *name)
 {
struct object *obj = >object;
+   struct rev_info *revs = ctx->revs;
struct tree_desc desc;
struct name_entry entry;
enum interesting match = revs->diffopt.pathspec.nr == 0 ?
@@ -133,14 +133,14 @@ static void process_tree(struct rev_info *revs,
}
 
strbuf_addstr(base, name);
-   if (!(obj->flags & USER_GIVEN) && filter_fn)
-   r = filter_fn(LOFS_BEGIN_TREE, obj,
- base->buf, >buf[baselen],
- filter_data);
+   if (!(obj->flags & USER_GIVEN) && ctx->filter_fn)
+   r = ctx->filter_fn(LOFS_BEGIN_TREE, obj,
+  base->buf, >buf[baselen],
+  ctx->filter_data);
if (r & LOFR_MARK_SEEN)
obj->flags |= SEEN;
if (r & LOFR_DO_SHOW)
-   show(obj, base->buf, cb_data);
+   ctx->show_object(obj, base->buf, ctx->show_data);
if (base->len)
strbuf_addch(base, '/');
 
@@ -157,29 +157,25 @@ static void process_tree(struct rev_info *revs,
}
 
if (S_ISDIR(entry.mode))
-   process_tree(revs,
+   process_tree(ctx,
 lookup_tree(the_repository, entry.oid),
-   

[PATCH v12 2/8] list-objects: refactor to process_tree_contents

2018-10-12 Thread Matthew DeVore
This will be used in a follow-up patch to reduce indentation needed when
invoking the logic conditionally. i.e. rather than:

if (foo) {
while (...) {
/* this is very indented */
}
}

we will have:

if (foo)
process_tree_contents(...);

Signed-off-by: Matthew DeVore 
---
 list-objects.c | 68 ++
 1 file changed, 41 insertions(+), 27 deletions(-)

diff --git a/list-objects.c b/list-objects.c
index 584518a3f..ccc529e5e 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -94,6 +94,46 @@ static void process_gitlink(struct traversal_context *ctx,
/* Nothing to do */
 }
 
+static void process_tree(struct traversal_context *ctx,
+struct tree *tree,
+struct strbuf *base,
+const char *name);
+
+static void process_tree_contents(struct traversal_context *ctx,
+ struct tree *tree,
+ struct strbuf *base)
+{
+   struct tree_desc desc;
+   struct name_entry entry;
+   enum interesting match = ctx->revs->diffopt.pathspec.nr == 0 ?
+   all_entries_interesting : entry_not_interesting;
+
+   init_tree_desc(, tree->buffer, tree->size);
+
+   while (tree_entry(, )) {
+   if (match != all_entries_interesting) {
+   match = tree_entry_interesting(, base, 0,
+  
>revs->diffopt.pathspec);
+   if (match == all_entries_not_interesting)
+   break;
+   if (match == entry_not_interesting)
+   continue;
+   }
+
+   if (S_ISDIR(entry.mode))
+   process_tree(ctx,
+lookup_tree(the_repository, entry.oid),
+base, entry.path);
+   else if (S_ISGITLINK(entry.mode))
+   process_gitlink(ctx, entry.oid->hash,
+   base, entry.path);
+   else
+   process_blob(ctx,
+lookup_blob(the_repository, entry.oid),
+base, entry.path);
+   }
+}
+
 static void process_tree(struct traversal_context *ctx,
 struct tree *tree,
 struct strbuf *base,
@@ -101,10 +141,6 @@ static void process_tree(struct traversal_context *ctx,
 {
struct object *obj = >object;
struct rev_info *revs = ctx->revs;
-   struct tree_desc desc;
-   struct name_entry entry;
-   enum interesting match = revs->diffopt.pathspec.nr == 0 ?
-   all_entries_interesting: entry_not_interesting;
int baselen = base->len;
enum list_objects_filter_result r = LOFR_MARK_SEEN | LOFR_DO_SHOW;
int gently = revs->ignore_missing_links ||
@@ -144,29 +180,7 @@ static void process_tree(struct traversal_context *ctx,
if (base->len)
strbuf_addch(base, '/');
 
-   init_tree_desc(, tree->buffer, tree->size);
-
-   while (tree_entry(, )) {
-   if (match != all_entries_interesting) {
-   match = tree_entry_interesting(, base, 0,
-  >diffopt.pathspec);
-   if (match == all_entries_not_interesting)
-   break;
-   if (match == entry_not_interesting)
-   continue;
-   }
-
-   if (S_ISDIR(entry.mode))
-   process_tree(ctx,
-lookup_tree(the_repository, entry.oid),
-base, entry.path);
-   else if (S_ISGITLINK(entry.mode))
-   process_gitlink(ctx, entry.oid->hash, base, entry.path);
-   else
-   process_blob(ctx,
-lookup_blob(the_repository, entry.oid),
-base, entry.path);
-   }
+   process_tree_contents(ctx, tree, base);
 
if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) {
r = ctx->filter_fn(LOFS_END_TREE, obj,
-- 
2.19.1.331.ge82ca0e54c-goog



Re: What's cooking in git.git (Oct 2018, #02; Sat, 13)

2018-10-12 Thread Stefan Beller
>
> * sb/strbuf-h-update (2018-09-29) 1 commit
>  - strbuf.h: format according to coding guidelines
>
>  Code clean-up to serve as a BCP example.
>
>  What's the status of this one after the discussion thread stopped here?
>  cf. 

I started rewriting the documentation according to your proposal of having
parameters with name, then referred to as NAME in the docs.

After a few examples, I must admit I do not like that style,
so I would just want to do the minimal part that would get this patch landed,
i.e.
(a) convince you that the patch is good as-is or
(b) resend with fewer parameters made explicit if
we desire to be concise instead.

> * nd/the-index (2018-09-21) 23 commits
>   (merged to 'next' on 2018-10-10 at 16e2e2e947)
[...]
>  (this branch is used by sb/more-repo-in-api.)
>
>  Various codepaths in the core-ish part learn to work on an
>  arbitrary in-core index structure, not necessarily the default
>  instance "the_index".
>
>  Will merge to 'master'.

Cool!

sb/more-repo-in-api is not part of this message,
but it requires at least one resend, so I'll do that.


> * sb/submodule-recursive-fetch-gets-the-tip (2018-10-11) 9 commits
>  . builtin/fetch: check for submodule updates for non branch fetches
>  . fetch: retry fetching submodules if needed objects were not fetched
>  . submodule: fetch in submodules git directory instead of in worktree
>  . repository: repo_submodule_init to take a submodule struct
>  . submodule.c: do not copy around submodule list
>  . submodule.c: move global changed_submodule_names into fetch submodule 
> struct
>  . submodule.c: sort changed_submodule_names before searching it
>  . submodule.c: fix indentation
>  . sha1-array: provide oid_array_filter
>
>  "git fetch --recurse-submodules" may not fetch the necessary commit
>  that is bound to the superproject, which is getting corrected.
>
>  Ejected for now, as it has fallouts in places like t/helper/.

This is the first time I hear about that, I'll look into that.
The tipmost commit there is also shoddy, I'll redo that.


Re: [PATCH] config.mak.dev: add -Wformat

2018-10-12 Thread Thomas Gummerer
On 10/12, Jonathan Nieder wrote:
> Jeff King wrote:
> > On Fri, Oct 12, 2018 at 07:40:37PM +0100, Thomas Gummerer wrote:
> 
> >> 801fa63a90 ("config.mak.dev: add -Wformat-security", 2018-09-08) added
> >> the -Wformat-security to the flags set in config.mak.dev.  In the gcc
> >> man page this is documented as:
> >>
> >>  If -Wformat is specified, also warn about uses of format
> >>  functions that represent possible security problems.  [...]
> >>
> >> That commit did however not add the -Wformat flag, and -Wformat is not
> >> specified anywhere else by default, so the added -Wformat-security had
> >> no effect.  Newer versions of gcc (gcc 8.2.1 in this particular case)
> >> warn about this and thus compilation fails with this option set.
> [...]
> > -Wformat is part of -Wall, which we already turn on by default (even for
> > non-developer builds).
> >
> > So I don't think we need to do anything more, though I'm puzzled that
> > you saw a failure. Do you set CFLAGS explicitly in your config.mak to
> > something that doesn't include -Wall?

Whoops embarrassing.  I had this set in my config.mak:

CFLAGS = -O$(O) -g $(EXTRA_CFLAGS)

What happened is that I had included -Wall in an old config.mak that I
copied from Thomas Rast when I started with my GSoC project.  Then
when "DEVELOPER=1" came around I switched to that at some point and
just removed everything from CFLAGS, except the possibility to
override the optimization level, the ability to add extra flags and
including debug symbols, but failed to notice that I had lost -Wall.

Maybe it would still be a good to add -Wall to avoid the surprise for
others.  But then again if someone overrides CFLAGS they should at
least check better what they're overriding ;)

> Thomas, do you use autoconf to generate config.mak.autogen?  I'm
> wondering if that produces a CFLAGS that doesn't include -Wall.

No, this was all my mistake :)

> > I'm not opposed to making config.mak.dev a bit more redundant to handle
> > this case, but we'd probably want to include all of -Wall, since it
> > contains many other warnings we'd want to make sure are enabled.
> 
> Do you mean putting -Wall instead of -Wformat?
> 
> Should we add -Wextra too?  From a quick test, it seems to build okay.

We do have that with setting DEVELOPER=extra-all.

> Thanks,
> Jonathan


Re: [PATCH] config.mak.dev: add -Wformat

2018-10-12 Thread Jeff King
On Fri, Oct 12, 2018 at 11:54:50AM -0700, Jonathan Nieder wrote:

> > I'm not opposed to making config.mak.dev a bit more redundant to handle
> > this case, but we'd probably want to include all of -Wall, since it
> > contains many other warnings we'd want to make sure are enabled.
> 
> Do you mean putting -Wall instead of -Wformat?

Yes.

> Should we add -Wextra too?  From a quick test, it seems to build okay.

We already do (though we have to then manually disable a few warnings
that we're not ready for -- see config.mak.dev).

-Peff


Re: [PATCH 3/3] ref-filter: free item->value and item->value->s

2018-10-12 Thread Jeff King
On Thu, Oct 11, 2018 at 10:19:22AM +0900, Junio C Hamano wrote:

> > @@ -1373,36 +1379,31 @@ static void fill_remote_ref_details(struct 
> > used_atom *atom, const char *refname,
> > }
> > } else if (atom->u.remote_ref.option == RR_TRACKSHORT) {
> > if (stat_tracking_info(branch, _ours, _theirs,
> > -  NULL, AHEAD_BEHIND_FULL) < 0)
> > +  NULL, AHEAD_BEHIND_FULL) < 0) {
> > +   *s = xstrdup("");
> > return;
> 
> It is a bit sad that we need to sprinkle xstrdup() all over the
> place, but I do not offhand think of a better alternative to ensure
> that it is safe to blindly free the .s field.

I think the root of the problem is that the current code needs
_something_ in the "s" field to know that the value has already been
filled in.

If there were a separate flag for "filled", then this could just be
NULL (and the later code that builds the output would have to realize
to replace that with an empty string).

I think in the long run, we'd ideally have one of two code structures:

  - a single pass, without iterating over the used atoms list
repeatedly. E.g., the way oid_object_info() takes a struct
representing the set of items that the caller is interested in, and
then fills it in as it digs for information.

  - individual atoms could write directly to an output strbuf, avoiding
the extra allocation and copy altogether (that would help these
cases that are just copying an empty string, but also the ones that
really are allocating a piece of data and end up copying it.

I'm OK with this approach in the meantime, though. There's a fair bit of
refactoring to get to either of those end-states, and this clears up the
confusing memory ownership issues. And I don't think for-each-ref is
_too_ sensitive to a few extra mallocs (as compared to say, cat-file's
formatter, which is often run once per object).

I didn't dig into the valgrind errors you saw, but I'd guess they are
the result of this patch missing one of these cases (if not a string
literal like "", perhaps a const pointer into another heap string).

-Peff


Re: [PATCH] config.mak.dev: add -Wformat

2018-10-12 Thread Jonathan Nieder
Jeff King wrote:
> On Fri, Oct 12, 2018 at 07:40:37PM +0100, Thomas Gummerer wrote:

>> 801fa63a90 ("config.mak.dev: add -Wformat-security", 2018-09-08) added
>> the -Wformat-security to the flags set in config.mak.dev.  In the gcc
>> man page this is documented as:
>>
>>  If -Wformat is specified, also warn about uses of format
>>  functions that represent possible security problems.  [...]
>>
>> That commit did however not add the -Wformat flag, and -Wformat is not
>> specified anywhere else by default, so the added -Wformat-security had
>> no effect.  Newer versions of gcc (gcc 8.2.1 in this particular case)
>> warn about this and thus compilation fails with this option set.
[...]
> -Wformat is part of -Wall, which we already turn on by default (even for
> non-developer builds).
>
> So I don't think we need to do anything more, though I'm puzzled that
> you saw a failure. Do you set CFLAGS explicitly in your config.mak to
> something that doesn't include -Wall?

Thomas, do you use autoconf to generate config.mak.autogen?  I'm
wondering if that produces a CFLAGS that doesn't include -Wall.

> I'm not opposed to making config.mak.dev a bit more redundant to handle
> this case, but we'd probably want to include all of -Wall, since it
> contains many other warnings we'd want to make sure are enabled.

Do you mean putting -Wall instead of -Wformat?

Should we add -Wextra too?  From a quick test, it seems to build okay.

Thanks,
Jonathan


Re: [RFC PATCH 00/19] Bring more repository handles into our code base

2018-10-12 Thread Jonathan Nieder
Hi,

Stefan Beller wrote:

> This applies on nd/the-index (b3c7eef9b05) and is the logical continuation of
> the object store series, which I sent over the last year.
>
> The previous series did take a very slow and pedantic approach,
> using a #define trick, see cfc62fc98c for details, but it turns out,
> that it doesn't work:

Thanks for the heads up --- this will remind me to review this new
series more carefully, since it differs from what was reviewed before.

I think this will be easiest to review with --function-context.  I can
generate that diff locally, so no need to resend.

>When changing the signature of widely used functions, it burdens the
>maintainer in resolving the semantic conflicts.
>
>In the orginal approach this was called a feature, as then we can ensure
>that not bugs creep into the code base during the merge window (while such
>a refactoring series wanders from pu to master). It turns out this
>was not well received and was just burdensome.

I don't agree with this characterization.

The question of who resolves conflicts is separate from the question
of whether conflicts appear, which is in turn separate from the
question of whether the build breaks.

I consider making the build break when a caller tries to use a
half-converted function too early to be a very useful feature.  There
is a way to do that in C++ that allows decoupled conversions, but the
C version forced an ordering of the conversions.  It seems that the
pain was caused by the combination of

 1. that coupling, which forced an ordering on the conversions and
prevented us from ordering the patches in an order based on
convenience of integration (unlike e.g. the "struct object_id"
series which was able to proceed by taking a batch covering a
quiet area of the tree at a time)

 2. as you mentioned, removal of old API at the same time of addition
 of new API forced callers across the tree to update at once

 3. the lack of having decided how to handle the anticipated churn

Now most of the conversions are done (thanks much for that) so the
ordering (1) is not the main remaining pain point.  Thanks for
tackling the other two in this series.

I want future API changes to be easier.  That means tackling the
following questions up front:

 i. Where does this fit in Rusty's API rating scheme
?
Does misuse (or misconverted callers) break the build, break
visibly at runtime, or are the effects more subtle?

 ii. Is there good test coverage for the new API?  Are there tests
 that need to be migrated?

 iii. Is there a way to automatically migrate callers, or does this
  require manual, error-prone work (thanks for tackling that in
  this one.)

 iv. How are we planning to handle multiple patches in flight?  Will
 the change produce merge conflicts?  How can others on the list
 help the maintainer with integrating this set of changes?

 iv. Is the ending point cleaner than where we started?

The #define trick you're referring to was a way of addressing (i).

[...]
>  79 files changed, 571 insertions(+), 278 deletions(-)

Most of the increase is in the coccinelle file and in improved
documentation.

It appears that some patches use a the_index-style
NO_THE_REPOSITORY_COMPATIBILITY_MACROS backward compatibility synonym
and others don't.  Can you say a little more about this aspect of the
approach?  Would the compatibility macros go away eventually?

Thanks,
Jonathan


Re: [PATCH] config.mak.dev: add -Wformat

2018-10-12 Thread Jeff King
On Fri, Oct 12, 2018 at 07:40:37PM +0100, Thomas Gummerer wrote:

> 801fa63a90 ("config.mak.dev: add -Wformat-security", 2018-09-08) added
> the -Wformat-security to the flags set in config.mak.dev.  In the gcc
> man page this is documented as:
> 
>  If -Wformat is specified, also warn about uses of format
>  functions that represent possible security problems.  [...]
> 
> That commit did however not add the -Wformat flag, and -Wformat is not
> specified anywhere else by default, so the added -Wformat-security had
> no effect.  Newer versions of gcc (gcc 8.2.1 in this particular case)
> warn about this and thus compilation fails with this option set.
> 
> Fix that, and make -Wformat-security actually useful by adding the
> -Wformat flag as well.  git compiles cleanly with both these flags
> applied.

-Wformat is part of -Wall, which we already turn on by default (even for
non-developer builds).

So I don't think we need to do anything more, though I'm puzzled that
you saw a failure. Do you set CFLAGS explicitly in your config.mak to
something that doesn't include -Wall?

I'm not opposed to making config.mak.dev a bit more redundant to handle
this case, but we'd probably want to include all of -Wall, since it
contains many other warnings we'd want to make sure are enabled.

-Peff


[PATCH] config.mak.dev: add -Wformat

2018-10-12 Thread Thomas Gummerer
801fa63a90 ("config.mak.dev: add -Wformat-security", 2018-09-08) added
the -Wformat-security to the flags set in config.mak.dev.  In the gcc
man page this is documented as:

 If -Wformat is specified, also warn about uses of format
 functions that represent possible security problems.  [...]

That commit did however not add the -Wformat flag, and -Wformat is not
specified anywhere else by default, so the added -Wformat-security had
no effect.  Newer versions of gcc (gcc 8.2.1 in this particular case)
warn about this and thus compilation fails with this option set.

Fix that, and make -Wformat-security actually useful by adding the
-Wformat flag as well.  git compiles cleanly with both these flags
applied.

Signed-off-by: Thomas Gummerer 
---

Sorry for not catching this before the patch made it to next.  

 config.mak.dev | 1 +
 1 file changed, 1 insertion(+)

diff --git a/config.mak.dev b/config.mak.dev
index 92d268137f..bf6f943452 100644
--- a/config.mak.dev
+++ b/config.mak.dev
@@ -7,6 +7,7 @@ CFLAGS += -pedantic
 CFLAGS += -DUSE_PARENS_AROUND_GETTEXT_N=0
 endif
 CFLAGS += -Wdeclaration-after-statement
+CFLAGS += -Wformat
 CFLAGS += -Wformat-security
 CFLAGS += -Wno-format-zero-length
 CFLAGS += -Wold-style-definition
-- 
2.19.1.937.g12227c8702.dirty



issue: strange `git diff --numstat` behavior.

2018-10-12 Thread Sergey Andreenko
git diff –numstat FOLDER1 FOLDER2 works strange when run from a git
controlled folder.

The output shrinks some symbols in the diff file paths.



For example:

Create a folder and call git init, for example: `C:\test`.

  mkdir C:\test

  cd C:\test

  git init

Create two folders with to diff. For example: ` C:\diff`, `C:\base`
and put a file in them with a diff (for example `1.txt` with `1` in
base and `1.txt` with 2 in diff).

  mkdir C:\base

  mkdir C:\diff

  echo. 12>C:\base\1.txt

  echo 13>C:\diff\1.txt

Run git diff:

pushd C:\

git.exe diff --numstat "C:\diff" "C:\base"

Output will be:

1   1   "C:\\diff/1.txt" => "C:\\base/1.txt"

Now move into C:\test and run it again:

pushd C:\test

git.exe diff --numstat "C:\diff" "C:\base"

1   1   "C:\\diff/1.txt" => "C:\\base/1.txt"

Now create a folder in `C:\test`, for example `one`:

mkdir one

cd one

git.exe diff --numstat "C:\diff" "C:\base"



output will be:

  0   1   {iff => ase}/1.txt

So (folder_name_length) symbols were cut from the path (“C:\\d” and “C:\\b”).



Is any way to avoid that? I have checked several git versions and they
all do the same.



Commands to repro:

mkdir C:\test

cd C:\test

git init

mkdir C:\base

mkdir C:\diff

echo. 12>C:\base\1.txt

echo 13>C:\diff\1.txt

pushd C:\

git.exe diff --numstat "C:\diff" "C:\base"

pushd C:\test

git.exe diff --numstat "C:\diff" "C:\base"

mkdir one

cd one

git.exe diff --numstat "C:\diff" "C:\base"



Best Regards,

Sergey Andrenko


Re: [PATCHv1 2/3] git-p4: unshelve into refs/remotes/p4-unshelved, not refs/remotes/p4/unshelved

2018-10-12 Thread Luke Diamand
On Fri, 12 Oct 2018 at 14:45, Junio C Hamano  wrote:
>
> Luke Diamand  writes:
>
> > The branch detection code looks for branches under refs/remotes/p4/...
> > and can end up getting confused if there are unshelved changes in
> > there as well. This happens in the function p4BranchesInGit().
> >
> > Instead, put the unshelved changes into refs/remotes/p4-unshelved/.
>
> I am not a p4 user (and not a git-p4 user), so it is a bit hard for
> me to assess if this is a backward incompatibile change and if so
> how serious potential breakage to existing users would be.

I don't think it's a particularly serious breakage - it reports the
branch it unshelves to, so it should be fairly obvious.

However, maybe it would make sense to pull this into a separate commit
to make it more obvious? I should have thought of that before
submitting.

>
> >
> > -If the target branch in refs/remotes/p4/unshelved already exists, the old 
> > one will
> > +If the target branch in refs/remotes/p4-unshelved already exists, the old 
> > one will
> >  be renamed.
> >
> >  
> >  $ git p4 sync
> >  $ git p4 unshelve 12345
> > -$ git show refs/remotes/p4/unshelved/12345
> > +$ git show p4/unshelved/12345
>
> Isn't this "p4-unshelved/12345" now?

Yes, I think another reason to pull into a separate commit.

Luke

>


Re: [PATCH v2 0/3] Add GIT_TEST_MULTI_PACK_INDEX environment variable

2018-10-12 Thread Derrick Stolee

On 10/12/2018 1:34 PM, Derrick Stolee via GitGitGadget wrote:

To increase coverage of the multi-pack-index feature, add a
GIT_TEST_MULTI_PACK_INDEX environment variable similar to other GIT_TEST_*
variables.

After creating the environment variable and running the test suite with it
enabled, I found a few bugs in the multi-pack-index implementation. These
are handled by the first two patches.

I have set up a CI build on Azure Pipelines [1] that runs the test suite
with a few optional features enabled, including GIT_TEST_MULTI_PACK_INDEX
and GIT_TEST_COMMIT_GRAPH. I'll use this to watch the features and ensure
they work well with the rest of the ongoing work. Eventually, we can add
these variables to the Travis CI scripts.

[1] https://git.visualstudio.com/git/_build?definitionId=4

Derrick Stolee (3):
   midx: fix broken free() in close_midx()
   midx: close multi-pack-index on repack
   multi-pack-index: define GIT_TEST_MULTI_PACK_INDEX

  builtin/repack.c|  7 +--
  midx.c  | 26 --
  midx.h  |  6 +-
  t/README|  4 
  t/t5310-pack-bitmaps.sh |  1 +
  t/t5319-multi-pack-index.sh |  2 +-
  t/t9300-fast-import.sh  |  2 +-
  7 files changed, 37 insertions(+), 11 deletions(-)


base-commit: 5a0cc8aca797dbd7d2be3b67458ff880ed45cddf
I should explicitly mention that this base commit is different as 
otherwise I will conflict with ds/multi-pack-verify with the new 
prototype in midx.h.


Thanks,
-Stolee


[PATCH v2 2/3] midx: close multi-pack-index on repack

2018-10-12 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee 

When repacking, we may remove pack-files. This invalidates the
multi-pack-index (if it exists). Previously, we removed the
multi-pack-index file before removing any pack-file. In some cases,
the repack command may load the multi-pack-index into memory. This
may lead to later in-memory references to the non-existent pack-
files.

Signed-off-by: Derrick Stolee 
---
 builtin/repack.c |  3 +--
 midx.c   | 15 ---
 midx.h   |  4 +++-
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index c6a7943d5c..44965cbaa3 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -431,8 +431,7 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
char *fname, *fname_old;
 
if (!midx_cleared) {
-   /* if we move a packfile, it will invalidated 
the midx */
-   clear_midx_file(get_object_directory());
+   clear_midx_file(the_repository);
midx_cleared = 1;
}
 
diff --git a/midx.c b/midx.c
index bf1f511862..22247a30ab 100644
--- a/midx.c
+++ b/midx.c
@@ -176,9 +176,13 @@ cleanup_fail:
return NULL;
 }
 
-static void close_midx(struct multi_pack_index *m)
+void close_midx(struct multi_pack_index *m)
 {
uint32_t i;
+
+   if (!m)
+   return;
+
munmap((unsigned char *)m->data, m->data_len);
close(m->fd);
m->fd = -1;
@@ -914,9 +918,14 @@ cleanup:
return 0;
 }
 
-void clear_midx_file(const char *object_dir)
+void clear_midx_file(struct repository *r)
 {
-   char *midx = get_midx_filename(object_dir);
+   char *midx = get_midx_filename(r->objects->objectdir);
+
+   if (r->objects && r->objects->multi_pack_index) {
+   close_midx(r->objects->multi_pack_index);
+   r->objects->multi_pack_index = NULL;
+   }
 
if (remove_path(midx)) {
UNLEAK(midx);
diff --git a/midx.h b/midx.h
index ce80b91c68..0f68bccdd5 100644
--- a/midx.h
+++ b/midx.h
@@ -42,7 +42,9 @@ int midx_contains_pack(struct multi_pack_index *m, const char 
*idx_name);
 int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, 
int local);
 
 int write_midx_file(const char *object_dir);
-void clear_midx_file(const char *object_dir);
+void clear_midx_file(struct repository *r);
 int verify_midx_file(const char *object_dir);
 
+void close_midx(struct multi_pack_index *m);
+
 #endif
-- 
gitgitgadget



[PATCH v2 3/3] multi-pack-index: define GIT_TEST_MULTI_PACK_INDEX

2018-10-12 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee 

The multi-pack-index feature is tested in isolation by
t5319-multi-pack-index.sh, but there are many more interesting
scenarios in the test suite surrounding pack-file data shapes
and interactions. Since the multi-pack-index is an optional
data structure, it does not make sense to include it by default
in those tests.

Instead, add a new GIT_TEST_MULTI_PACK_INDEX environment variable
that enables core.multiPackIndex and writes a multi-pack-index
after each 'git repack' command. This adds extra test coverage
when needed.

There are a few spots in the test suite that need to react to this
change:

* t5319-multi-pack-index.sh: there is a test that checks that
  'git repack' deletes the multi-pack-index. Disable the environment
  variable to ensure this still happens.

* t5310-pack-bitmaps.sh: One test moves a pack-file from the object
  directory to an alternate. This breaks the multi-pack-index, so
  delete the multi-pack-index at this point, if it exists.

* t9300-fast-import.sh: One test verifies the number of files in
  the .git/objects/pack directory is exactly 8. Exclude the
  multi-pack-index from this count so it is still 8 in all cases.

Signed-off-by: Derrick Stolee 
---
 builtin/repack.c| 4 
 midx.c  | 9 +++--
 midx.h  | 2 ++
 t/README| 4 
 t/t5310-pack-bitmaps.sh | 1 +
 t/t5319-multi-pack-index.sh | 2 +-
 t/t9300-fast-import.sh  | 2 +-
 7 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index 44965cbaa3..26dcccdafc 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -553,6 +553,10 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
if (!no_update_server_info)
update_server_info(0);
remove_temporary_files();
+
+   if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0))
+   write_midx_file(get_object_directory());
+
string_list_clear(, 0);
string_list_clear(, 0);
string_list_clear(_packs, 0);
diff --git a/midx.c b/midx.c
index 22247a30ab..02b2211e31 100644
--- a/midx.c
+++ b/midx.c
@@ -335,9 +335,14 @@ int prepare_multi_pack_index_one(struct repository *r, 
const char *object_dir, i
struct multi_pack_index *m;
struct multi_pack_index *m_search;
int config_value;
+   static int env_value = -1;
 
-   if (repo_config_get_bool(r, "core.multipackindex", _value) ||
-   !config_value)
+   if (env_value < 0)
+   env_value = git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0);
+
+   if (!env_value &&
+   (repo_config_get_bool(r, "core.multipackindex", _value) ||
+   !config_value))
return 0;
 
for (m_search = r->objects->multi_pack_index; m_search; m_search = 
m_search->next)
diff --git a/midx.h b/midx.h
index 0f68bccdd5..f2bb7e681c 100644
--- a/midx.h
+++ b/midx.h
@@ -3,6 +3,8 @@
 
 #include "repository.h"
 
+#define GIT_TEST_MULTI_PACK_INDEX "GIT_TEST_MULTI_PACK_INDEX"
+
 struct multi_pack_index {
struct multi_pack_index *next;
 
diff --git a/t/README b/t/README
index 5e48a043ce..9bfdd3004c 100644
--- a/t/README
+++ b/t/README
@@ -327,6 +327,10 @@ GIT_TEST_COMMIT_GRAPH=, when true, forces the 
commit-graph to
 be written after every 'git commit' command, and overrides the
 'core.commitGraph' setting to true.
 
+GIT_TEST_MULTI_PACK_INDEX=, when true, forces the multi-pack-
+index to be written after every 'git repack' command, and overrides the
+'core.multiPackIndex' setting to true.
+
 Naming Tests
 
 
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index 1be3459c5b..82d7f7f6a5 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -191,6 +191,7 @@ test_expect_success 'pack-objects respects 
--honor-pack-keep (local bitmapped pa
 
 test_expect_success 'pack-objects respects --local (non-local bitmapped pack)' 
'
mv .git/objects/pack/$packbitmap.* alt.git/objects/pack/ &&
+   rm -f .git/objects/pack/multi-pack-index &&
test_when_finished "mv alt.git/objects/pack/$packbitmap.* 
.git/objects/pack/" &&
echo HEAD | git pack-objects --local --stdout --revs >3b.pack &&
git index-pack 3b.pack &&
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index bd8e841b81..70926b5bc0 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -271,7 +271,7 @@ test_expect_success 'git-fsck incorrect offset' '
 
 test_expect_success 'repack removes multi-pack-index' '
test_path_is_file $objdir/pack/multi-pack-index &&
-   git repack -adf &&
+   GIT_TEST_MULTI_PACK_INDEX=0 git repack -adf &&
test_path_is_missing $objdir/pack/multi-pack-index
 '
 
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 40fe7e4976..59a13b6a77 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -1558,7 +1558,7 @@ test_expect_success 'O: 

[PATCH v2 1/3] midx: fix broken free() in close_midx()

2018-10-12 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee 

When closing a multi-pack-index, we intend to close each pack-file
and free the struct packed_git that represents it. However, this
line was previously freeing the array of pointers, not the
pointer itself. This leads to a double-free issue.

Signed-off-by: Derrick Stolee 
---
 midx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/midx.c b/midx.c
index 713d6f9dde..bf1f511862 100644
--- a/midx.c
+++ b/midx.c
@@ -186,7 +186,7 @@ static void close_midx(struct multi_pack_index *m)
for (i = 0; i < m->num_packs; i++) {
if (m->packs[i]) {
close_pack(m->packs[i]);
-   free(m->packs);
+   free(m->packs[i]);
}
}
FREE_AND_NULL(m->packs);
-- 
gitgitgadget



[PATCH v2 0/3] Add GIT_TEST_MULTI_PACK_INDEX environment variable

2018-10-12 Thread Derrick Stolee via GitGitGadget
To increase coverage of the multi-pack-index feature, add a
GIT_TEST_MULTI_PACK_INDEX environment variable similar to other GIT_TEST_*
variables.

After creating the environment variable and running the test suite with it
enabled, I found a few bugs in the multi-pack-index implementation. These
are handled by the first two patches.

I have set up a CI build on Azure Pipelines [1] that runs the test suite
with a few optional features enabled, including GIT_TEST_MULTI_PACK_INDEX
and GIT_TEST_COMMIT_GRAPH. I'll use this to watch the features and ensure
they work well with the rest of the ongoing work. Eventually, we can add
these variables to the Travis CI scripts.

[1] https://git.visualstudio.com/git/_build?definitionId=4

Derrick Stolee (3):
  midx: fix broken free() in close_midx()
  midx: close multi-pack-index on repack
  multi-pack-index: define GIT_TEST_MULTI_PACK_INDEX

 builtin/repack.c|  7 +--
 midx.c  | 26 --
 midx.h  |  6 +-
 t/README|  4 
 t/t5310-pack-bitmaps.sh |  1 +
 t/t5319-multi-pack-index.sh |  2 +-
 t/t9300-fast-import.sh  |  2 +-
 7 files changed, 37 insertions(+), 11 deletions(-)


base-commit: 5a0cc8aca797dbd7d2be3b67458ff880ed45cddf
Published-As: 
https://github.com/gitgitgadget/git/releases/tags/pr-27%2Fderrickstolee%2Fmidx-test%2Fupstream-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git 
pr-27/derrickstolee/midx-test/upstream-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/27

Range-diff vs v1:

 1:  9fcbbe336d = 1:  8bd672fe26 midx: fix broken free() in close_midx()
 2:  725ebadc92 ! 2:  2d8f26679d midx: close multi-pack-index on repack
 @@ -15,16 +15,15 @@
  --- a/builtin/repack.c
  +++ b/builtin/repack.c
  @@
 +  char *fname, *fname_old;
   
if (!midx_cleared) {
 -  /* if we move a packfile, it will invalidated 
the midx */
 -+ if (the_repository->objects) {
 -+ 
close_midx(the_repository->objects->multi_pack_index);
 -+ 
the_repository->objects->multi_pack_index = NULL;
 -+ }
 -  clear_midx_file(get_object_directory());
 +- /* if we move a packfile, it will invalidated 
the midx */
 +- clear_midx_file(get_object_directory());
 ++ clear_midx_file(the_repository);
midx_cleared = 1;
}
 + 
  
  diff --git a/midx.c b/midx.c
  --- a/midx.c
 @@ -44,13 +43,34 @@
munmap((unsigned char *)m->data, m->data_len);
close(m->fd);
m->fd = -1;
 +@@
 +  return 0;
 + }
 + 
 +-void clear_midx_file(const char *object_dir)
 ++void clear_midx_file(struct repository *r)
 + {
 +- char *midx = get_midx_filename(object_dir);
 ++ char *midx = get_midx_filename(r->objects->objectdir);
 ++
 ++ if (r->objects && r->objects->multi_pack_index) {
 ++ close_midx(r->objects->multi_pack_index);
 ++ r->objects->multi_pack_index = NULL;
 ++ }
 + 
 +  if (remove_path(midx)) {
 +  UNLEAK(midx);
  
  diff --git a/midx.h b/midx.h
  --- a/midx.h
  +++ b/midx.h
  @@
 + int prepare_multi_pack_index_one(struct repository *r, const char 
*object_dir, int local);
 + 
   int write_midx_file(const char *object_dir);
 - void clear_midx_file(const char *object_dir);
 +-void clear_midx_file(const char *object_dir);
 ++void clear_midx_file(struct repository *r);
 + int verify_midx_file(const char *object_dir);
   
  +void close_midx(struct multi_pack_index *m);
  +
 3:  04e3e91082 = 3:  57c64e814c multi-pack-index: define 
GIT_TEST_MULTI_PACK_INDEX

-- 
gitgitgadget


Does git load index file into memory?

2018-10-12 Thread Farhan Khan
Hi all,

Does git load the entire index file into memory when it wants to
edit/view it? I ask because I wonder if this can become a problem with
the index file becomes arbitrarily large, like for the Linux kernel.

Thanks,
--
Farhan Khan
PGP Fingerprint: B28D 2726 E2BC A97E 3854 5ABE 9A9F 00BC D525 16EE


Re: [PATCH v3 4/7] revision.c: begin refactoring --topo-order logic

2018-10-12 Thread Johannes Sixt

Am 12.10.18 um 08:33 schrieb Junio C Hamano:

"Derrick Stolee via GitGitGadget"  writes:

+struct topo_walk_info {};
+
+static void init_topo_walk(struct rev_info *revs)
+{
+   struct topo_walk_info *info;
+   revs->topo_walk_info = xmalloc(sizeof(struct topo_walk_info));
+   info = revs->topo_walk_info;
+   memset(info, 0, sizeof(struct topo_walk_info));


There is no member in the struct at this point.  Are we sure this is
safe?  Just being curious.


sizeof cannot return 0. sizeof(struct topo_walk_info) will be 1 here.

-- Hannes


Re: [PATCH] zlib.c: use size_t for size

2018-10-12 Thread Johannes Schindelin
Hi Junio,

On Fri, 12 Oct 2018, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > Hi Junio,
> >
> > On Fri, 12 Oct 2018, Junio C Hamano wrote:
> >
> >> From: Martin Koegler 
> >> Date: Thu, 10 Aug 2017 20:13:08 +0200
> >> 
> >> Signed-off-by: Martin Koegler 
> >> Signed-off-by: Junio C Hamano 
> >> ---
> >> 
> >>  * I made minimal adjustments to make the change apply to today's
> >>codebase.  I still find some choices and mixing of off_t and
> >>size_t done by the patch a bit iffy, and some arith may need to
> >>become st_addX().  Extra set of eyes are very much appreciated.
> >> 
> >>  builtin/pack-objects.c | 10 +-
> >>  cache.h| 10 +-
> >>  pack-check.c   |  6 +++---
> >>  pack.h |  2 +-
> >>  packfile.h |  2 +-
> >>  wrapper.c  |  8 
> >>  zlib.c |  8 
> >>  7 files changed, 23 insertions(+), 23 deletions(-)
> >> 
> >> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> >> index e6316d294d..b9ca04eb8a 100644
> >> --- a/builtin/pack-objects.c
> >> +++ b/builtin/pack-objects.c
> >> @@ -266,15 +266,15 @@ static void copy_pack_data(struct hashfile *f,
> >>struct packed_git *p,
> >>struct pack_window **w_curs,
> >>off_t offset,
> >> -  off_t len)
> >> +  size_t len)
> >>  {
> >>unsigned char *in;
> >> -  unsigned long avail;
> >> +  size_t avail;
> >>  
> >>while (len) {
> >>in = use_pack(p, w_curs, offset, );
> >>if (avail > len)
> >
> > Do we have to be careful to handle sizeof(off_t) != sizeof(size_t) here? I
> > guess we would receive a compiler warning about different sizes in that
> > case, but it makes me worry.
> 
> Yup, you just said exactly the same thing as I already said in the
> part you quoted.  I still find the mixed use of off_t and size_t in
> this patch iffy, and that was the secondary reason why the patch was
> kept in the stalled state for so long.  The primary reason was that
> nobody tried to dust it off and reignite the topic so far---which I
> am trying to correct, but as I said, this is just minimally adjusted
> to today's codebase, without any attempt to improve relative to the
> original patch.
> 
> I think we are in agreement in that this is not making things worse,
> as we are already in the mixed arith territory before this patch.

I will *try* to find the time to audit this some more, then. Maybe next
week, maybe the one afterwards... ;-)

Ciao,
Dscho

> 
> 


Re: [PATCH v3 0/2] rebase -i: introduce the 'break' command

2018-10-12 Thread Johannes Schindelin
Hi Junio,

On Fri, 12 Oct 2018, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" 
> writes:
> 
> > This patch introduces that, based on ag/rebase-i-in-c.
> >
> > Changes since v2:
> >
> >  * Fixed two typos.
> >  * When interrupting the rebase, break now outputs a message.
> 
> I was about to merge the whole collection of topics to 'next', but
> this came to stop me just in time.
> 
> The typofixes are appreciated of course, and look trivially correct.
> 
> I find that the if() condition that does too many side-effect-full
> operations in stopped-at-head shows bad taste.  It is still short
> enough to understand what each step is trying to do, but would
> encourage others who later may touch the code to make it even more
> complex.
> 
> But it is a short and isolated static helper function, so I won't
> complain too loudly.
> 
> Will replace and requeue.

Thanks,
Dscho

> 
> Thanks.
> 


Re: [PATCH v3 2/2] rebase -i: introduce the 'break' command

2018-10-12 Thread Johannes Schindelin
Hi Junio,

On Fri, 12 Oct 2018, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" 
> writes:
> 
> > @@ -3293,6 +3312,9 @@ static int pick_commits(struct todo_list *todo_list, 
> > struct replay_opts *opts)
> > unlink(rebase_path_stopped_sha());
> > unlink(rebase_path_amend());
> > delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
> > +
> > +   if (item->command == TODO_BREAK)
> > +   return stopped_at_head();
> > }
> 
> The earlier one had "break;" here, which broke out of the while()
> loop, let the control reach "if (is_reabse_i(opts)) {" block after
> the loop, and the block would have noticed that current < nr and
> returned 0.  So from the point of view of the overall control flow
> of the caller of this function, there is no change relative to v2.
> The only difference in v3 is that stopped_at_head() gives a useful
> message.

Well, I should have called out this `break;` -> `return 0;` change, too,
as I think it makes the code flow *a lot* more obvious. As you say, I
relied earlier on the next loop to return early, but that is not what the
`edit` command does: it returns out of the first loop, too.

> 
> Good.

Thanks,
Dscho
> 
> 


Re: [PATCH v3 1/2] commit-graph write: add progress output

2018-10-12 Thread Derrick Stolee

On 10/12/2018 11:07 AM, Ævar Arnfjörð Bjarmason wrote:

On Fri, Oct 12 2018, Junio C Hamano wrote:


Makes sense. If this second iteration were also time consuming,
then it probably is a good idea to split these into two separate
phases?  "Counting 1...N" followed by "Inspecting 1...N" or
something like that.  Of course, if the latter does not take much
time, then doing without any progress indicator is also fine.

That's a good point. Derrick: If the three loops in close_reachable()
had to be split up into different progress stages and given different
names what do you think they should be? Now it's "Annotating commits in
commit graph" for all of them.


The following is the best I can think of right now:

1. Loading known commits.
2. Expanding reachable commits.
3. Clearing commit marks.

-Stolee


Re: [PATCH v3 1/2] commit-graph write: add progress output

2018-10-12 Thread Ævar Arnfjörð Bjarmason


On Fri, Oct 12 2018, Junio C Hamano wrote:

> SZEDER Gábor  writes:
>
>>> for (i = 0; i < oids->nr; i++) {
>>> +   display_progress(progress, ++j);
>>> commit = lookup_commit(the_repository, >list[i]);
>>>
>>> if (commit && !parse_commit(commit))
>>> @@ -611,19 +624,28 @@ static void close_reachable(struct packed_oid_list 
>>> *oids)
>>> }
>>
>> The above loops have already counted all the commits, and, more
>> importantly, did all the hard work that takes time and makes the
>> progress indicator useful.
>>
>>> for (i = 0; i < oids->nr; i++) {
>>> +   display_progress(progress, ++j);
>>
>> This display_progress() call, however, doesn't seem to be necessary.
>> First, it counts all commits for a second time, resulting in the ~2x
>> difference compared to the actual number of commits, and then causing
>> my confusion.  Second, all what this loop is doing is setting a flag
>> in commits that were already looked up and parsed in the above loops.
>> IOW this loop is very fast, and the progress indicator jumps from
>> ~780k right to 1.5M, even on my tiny laptop, so it doesn't need a
>> progress indicator at all.
>
> Makes sense.  If this second iteration were also time consuming,
> then it probably is a good idea to split these into two separate
> phases?  "Counting 1...N" followed by "Inspecting 1...N" or
> something like that.  Of course, if the latter does not take much
> time, then doing without any progress indicator is also fine.

That's a good point. Derrick: If the three loops in close_reachable()
had to be split up into different progress stages and given different
names what do you think they should be? Now it's "Annotating commits in
commit graph" for all of them.


Re: What's cooking in git.git (Oct 2018, #02; Sat, 13)

2018-10-12 Thread Ævar Arnfjörð Bjarmason


On Fri, Oct 12 2018, Junio C Hamano wrote:

> * ab/gc-doc-update (2018-10-11) 1 commit
>  - gc doc: mention the commit-graph in the intro
>
>  The documentation of "git gc" has been updated to mention that it
>  is no longer limited to "pruning away crufts" but also updates
>  ancillary files like commit-graph as a part of repository
>  optimization.
>
>  Waiting for reactions.
>  cf. <20181010193818.20399-1-ava...@gmail.com>
>  The author seems to feel that this might be controversial.

Probably shouldn't have mentioned that. SZEDER I think had more issues
with the "that's not GC" parts of what git-gc is doing, so I'll let him
chime in. I'm working on a more general update to gc.c which might bring
it more in the "not really GC anymore" direction.

But this patch is just describing the status quo, so I think it should
be uncontroversial to merge it, or at least we can have the discussion
about what we should do in a different venue than a doc fixup for
accurately describing what we're doing now.

Another topic: As noted in <20181010193235.17359-1-ava...@gmail.com> and
this follow-up
https://public-inbox.org/git/87h8hsexdm@evledraar.gmail.com/ it
would be great if you could pick up the "gc: remove redundant check for
gc_auto_threshold" patch. A small change that makes later refactoring a
tiny bit smaller and easier to understand.



What's cooking in git.git (Oct 2018, #02; Sat, 13)

2018-10-12 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

Quite a lot of topics have been merged to 'next'.  Let's see how
well they fare.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[New Topics]

* ab/gc-doc-update (2018-10-11) 1 commit
 - gc doc: mention the commit-graph in the intro

 The documentation of "git gc" has been updated to mention that it
 is no longer limited to "pruning away crufts" but also updates
 ancillary files like commit-graph as a part of repository
 optimization.

 Waiting for reactions.
 cf. <20181010193818.20399-1-ava...@gmail.com>
 The author seems to feel that this might be controversial.


* du/branch-show-current (2018-10-12) 2 commits
 - SQUASH???
 - branch: introduce --show-current display option

 "git branch" learned a new subcommand "--show-current".


* du/rev-parse-is-plumbing (2018-10-11) 1 commit
 - doc: move git-rev-parse from porcelain to plumbing

 Doc update.

 Will merge to 'next'.


* mm/doc-no-dashed-git (2018-10-11) 1 commit
 - doc: fix a typo and clarify a sentence

 Doc update.

 Will merge to 'next'.


* ot/ref-filter-plug-leaks (2018-10-11) 3 commits
 . ref-filter: free item->value and item->value->s
 . ls-remote: release memory instead of UNLEAK
 . ref-filter: free memory from used_atom

 Plugging a handful of memory leaks in the ref-filter codepath.

 Seems to break t6300 standalone and when merged to 'pu'.


* rv/send-email-cc-misc-by (2018-10-11) 3 commits
 - send-email: also pick up cc addresses from -by trailers
 - send-email: only consider lines containing @ or <> for automatic Cc'ing
 - Documentation/git-send-email.txt: style fixes

 "git send-email" learned to grab address-looking string on any
 trailer whose name ends with "-by"; --suppress-cc=misc-by on the
 command line, or setting sendemail.suppresscc configuration
 variable to "misc-by", can be used to disable this behaviour.

 This is a backward-incompatible change that may surprise existing
 users.


* du/cherry-is-plumbing (2018-10-12) 1 commit
 - doc: move git-cherry to plumbing

 Doc update to mark "git cherry" as a plumbing command.


* du/get-tar-commit-id-is-plumbing (2018-10-12) 1 commit
 - doc: move git-get-tar-commit-id to plumbing

 Doc update to mark "git get-tar-commit-id" as a plumbing command.

 Will merge to 'next'.


* lm/range-diff-submodule-fix (2018-10-12) 1 commit
 - range-diff: allow to diff files regardless submodule

 "git range-diff" did not work well when the compared ranges had
 changes in submodules and the "--submodule=log" was used.

 Will merge to 'next'.


* sb/diff-emit-line-ws-markup-cleanup (2018-10-12) 1 commit
 - diff.c: pass sign_index to emit_line_ws_markup

 Code clean-up.

 Will merge to 'next'.

--
[Stalled]

* lt/date-human (2018-07-09) 1 commit
 - Add 'human' date format

 A new date format "--date=human" that morphs its output depending
 on how far the time is from the current time has been introduced.
 "--date=auto" can be used to use this new format when the output is
 goint to the pager or to the terminal and otherwise the default
 format.

--
[Cooking]

* mk/use-size-t-in-zlib (2018-10-12) 1 commit
 - zlib.c: use size_t for size

 The wrapper to call into zlib followed our long tradition to use
 "unsigned long" for sizes of regions in memory, which have been
 updated to use "size_t".


* ds/reachable-final-cleanup (2018-09-25) 1 commit
  (merged to 'next' on 2018-10-12 at 32c6d5f55a)
 + commit-reach: cleanups in can_all_from_reach...

 Code already in 'master' is further cleaned-up by this patch.

 Will merge to 'master'.


* dz/credential-doc-url-matching-rules (2018-09-27) 1 commit
  (merged to 'next' on 2018-10-12 at 4547952530)
 + doc: clarify gitcredentials path component matching

 Doc update.

 Will merge to 'master'.


* en/status-multiple-renames-to-the-same-target-fix (2018-09-27) 1 commit
  (merged to 'next' on 2018-10-12 at 4976fc61a0)
 + commit: fix erroneous BUG, 'multiple renames on the same target? how?'

 The code in "git status" sometimes hit an assertion failure.  This
 was caused by a structure that was reused without cleaning the data
 used for the first run, which has been corrected.

 Will merge to 'master'.


* jc/how-to-document-api (2018-09-29) 1 commit
  (merged to 'next' on 2018-10-12 at 7c9bd82285)
 + CodingGuidelines: document the API in *.h files

 Doc update.

 Will merge to 'master'.


* jc/rebase-in-c-5-test-typofix (2018-10-11) 1 commit
 - rebase: fix typoes in error messages
 (this branch uses pk/rebase-in-c, pk/rebase-in-c-2-basic, 

Re: [PATCH] subtree: performance improvement for finding unexpected parent commits

2018-10-12 Thread Junio C Hamano
Roger Strain  writes:

> After testing a previous patch at larger scale, a performance issue was
> detected when using git show to locate parent revisions, with a single
> run of the git show command taking 2 seconds or longer in a complex repo.
> When the command is required tens or hundreds of times in a run of the
> script, the additional wait time is unaccepatable. Replacing the command
> with git rev-parse resulted in significantly increased performance, with
> the command in question returning instantly.
>
> Signed-off-by: Roger Strain 
> Thanks-to: Junio C Hamano 

That usually is spelled as "Helped-by:".

Will queue.  Thanks.

I still find it disturbing not to know why "show -s --format=..."
takes measurable time, though.  "-s" means "we do not need any diff
output", so it ought to be comparable to "git cat-file commit $rev"
with some formatting, but apparently your repository is making Git
spend a lot more than that.  Puzzled...

> ---
>  contrib/subtree/git-subtree.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
> index 1c157dbd9..147201dc6 100755
> --- a/contrib/subtree/git-subtree.sh
> +++ b/contrib/subtree/git-subtree.sh
> @@ -633,7 +633,7 @@ process_split_commit () {
>   else
>   # processing commit without normal parent information;
>   # fetch from repo
> - parents=$(git show -s --pretty=%P "$rev")
> + parents=$(git rev-parse "$rev^@")
>   extracount=$(($extracount + 1))
>   fi


Re: [PATCH v3 2/2] rebase -i: introduce the 'break' command

2018-10-12 Thread Junio C Hamano
"Johannes Schindelin via GitGitGadget" 
writes:

> @@ -3293,6 +3312,9 @@ static int pick_commits(struct todo_list *todo_list, 
> struct replay_opts *opts)
>   unlink(rebase_path_stopped_sha());
>   unlink(rebase_path_amend());
>   delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
> +
> + if (item->command == TODO_BREAK)
> + return stopped_at_head();
>   }

The earlier one had "break;" here, which broke out of the while()
loop, let the control reach "if (is_reabse_i(opts)) {" block after
the loop, and the block would have noticed that current < nr and
returned 0.  So from the point of view of the overall control flow
of the caller of this function, there is no change relative to v2.
The only difference in v3 is that stopped_at_head() gives a useful
message.

Good.



Re: [PATCH v3 0/2] rebase -i: introduce the 'break' command

2018-10-12 Thread Junio C Hamano
"Johannes Schindelin via GitGitGadget" 
writes:

> This patch introduces that, based on ag/rebase-i-in-c.
>
> Changes since v2:
>
>  * Fixed two typos.
>  * When interrupting the rebase, break now outputs a message.

I was about to merge the whole collection of topics to 'next', but
this came to stop me just in time.

The typofixes are appreciated of course, and look trivially correct.

I find that the if() condition that does too many side-effect-full
operations in stopped-at-head shows bad taste.  It is still short
enough to understand what each step is trying to do, but would
encourage others who later may touch the code to make it even more
complex.

But it is a short and isolated static helper function, so I won't
complain too loudly.

Will replace and requeue.

Thanks.


RE: [PATCH] subtree: performance improvement for finding unexpected parent commits

2018-10-12 Thread Strain, Roger L.
Original failed to include the note that this patch is for the 'next' branch, 
my apologies.

> -Original Message-
> From: Roger Strain 
> Sent: Friday, October 12, 2018 8:52 AM
> To: git@vger.kernel.org
> Cc: Strain, Roger L. 
> Subject: [PATCH] subtree: performance improvement for finding
> unexpected parent commits
> 
> After testing a previous patch at larger scale, a performance issue was
> detected when using git show to locate parent revisions, with a single run of
> the git show command taking 2 seconds or longer in a complex repo.
> When the command is required tens or hundreds of times in a run of the
> script, the additional wait time is unaccepatable. Replacing the command
> with git rev-parse resulted in significantly increased performance, with the
> command in question returning instantly.
> 
> Signed-off-by: Roger Strain 
> Thanks-to: Junio C Hamano 
> ---
>  contrib/subtree/git-subtree.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
> index 1c157dbd9..147201dc6 100755
> --- a/contrib/subtree/git-subtree.sh
> +++ b/contrib/subtree/git-subtree.sh
> @@ -633,7 +633,7 @@ process_split_commit () {
>   else
>   # processing commit without normal parent information;
>   # fetch from repo
> - parents=$(git show -s --pretty=%P "$rev")
> + parents=$(git rev-parse "$rev^@")
>   extracount=$(($extracount + 1))
>   fi
> 
> --
> 2.19.1



[PATCH] subtree: performance improvement for finding unexpected parent commits

2018-10-12 Thread Roger Strain
After testing a previous patch at larger scale, a performance issue was
detected when using git show to locate parent revisions, with a single
run of the git show command taking 2 seconds or longer in a complex repo.
When the command is required tens or hundreds of times in a run of the
script, the additional wait time is unaccepatable. Replacing the command
with git rev-parse resulted in significantly increased performance, with
the command in question returning instantly.

Signed-off-by: Roger Strain 
Thanks-to: Junio C Hamano 
---
 contrib/subtree/git-subtree.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
index 1c157dbd9..147201dc6 100755
--- a/contrib/subtree/git-subtree.sh
+++ b/contrib/subtree/git-subtree.sh
@@ -633,7 +633,7 @@ process_split_commit () {
else
# processing commit without normal parent information;
# fetch from repo
-   parents=$(git show -s --pretty=%P "$rev")
+   parents=$(git rev-parse "$rev^@")
extracount=$(($extracount + 1))
fi
 
-- 
2.19.1



Re: [PATCH] zlib.c: use size_t for size

2018-10-12 Thread Junio C Hamano
Johannes Schindelin  writes:

> Hi Junio,
>
> On Fri, 12 Oct 2018, Junio C Hamano wrote:
>
>> From: Martin Koegler 
>> Date: Thu, 10 Aug 2017 20:13:08 +0200
>> 
>> Signed-off-by: Martin Koegler 
>> Signed-off-by: Junio C Hamano 
>> ---
>> 
>>  * I made minimal adjustments to make the change apply to today's
>>codebase.  I still find some choices and mixing of off_t and
>>size_t done by the patch a bit iffy, and some arith may need to
>>become st_addX().  Extra set of eyes are very much appreciated.
>> 
>>  builtin/pack-objects.c | 10 +-
>>  cache.h| 10 +-
>>  pack-check.c   |  6 +++---
>>  pack.h |  2 +-
>>  packfile.h |  2 +-
>>  wrapper.c  |  8 
>>  zlib.c |  8 
>>  7 files changed, 23 insertions(+), 23 deletions(-)
>> 
>> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
>> index e6316d294d..b9ca04eb8a 100644
>> --- a/builtin/pack-objects.c
>> +++ b/builtin/pack-objects.c
>> @@ -266,15 +266,15 @@ static void copy_pack_data(struct hashfile *f,
>>  struct packed_git *p,
>>  struct pack_window **w_curs,
>>  off_t offset,
>> -off_t len)
>> +size_t len)
>>  {
>>  unsigned char *in;
>> -unsigned long avail;
>> +size_t avail;
>>  
>>  while (len) {
>>  in = use_pack(p, w_curs, offset, );
>>  if (avail > len)
>
> Do we have to be careful to handle sizeof(off_t) != sizeof(size_t) here? I
> guess we would receive a compiler warning about different sizes in that
> case, but it makes me worry.

Yup, you just said exactly the same thing as I already said in the
part you quoted.  I still find the mixed use of off_t and size_t in
this patch iffy, and that was the secondary reason why the patch was
kept in the stalled state for so long.  The primary reason was that
nobody tried to dust it off and reignite the topic so far---which I
am trying to correct, but as I said, this is just minimally adjusted
to today's codebase, without any attempt to improve relative to the
original patch.

I think we are in agreement in that this is not making things worse,
as we are already in the mixed arith territory before this patch.



Re: [PATCHv1 2/3] git-p4: unshelve into refs/remotes/p4-unshelved, not refs/remotes/p4/unshelved

2018-10-12 Thread Junio C Hamano
Luke Diamand  writes:

> The branch detection code looks for branches under refs/remotes/p4/...
> and can end up getting confused if there are unshelved changes in
> there as well. This happens in the function p4BranchesInGit().
>
> Instead, put the unshelved changes into refs/remotes/p4-unshelved/.

I am not a p4 user (and not a git-p4 user), so it is a bit hard for
me to assess if this is a backward incompatibile change and if so
how serious potential breakage to existing users would be.

>  
> -If the target branch in refs/remotes/p4/unshelved already exists, the old 
> one will
> +If the target branch in refs/remotes/p4-unshelved already exists, the old 
> one will
>  be renamed.
>  
>  
>  $ git p4 sync
>  $ git p4 unshelve 12345
> -$ git show refs/remotes/p4/unshelved/12345
> +$ git show p4/unshelved/12345

Isn't this "p4-unshelved/12345" now?



Re: [PATCH v4] branch: introduce --show-current display option

2018-10-12 Thread Eric Sunshine
On Fri, Oct 12, 2018 at 9:34 AM Daniels Umanovskis
 wrote:
> When called with --show-current, git branch will print the current
> branch name and terminate. Only the actual name gets printed,
> without refs/heads. In detached HEAD state, nothing is output.
>
> Signed-off-by: Daniels Umanovskis 
> ---
> diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh
> @@ -100,6 +100,49 @@ test_expect_success 'git branch -v pattern does not show 
> branch summaries' '
> +test_expect_success 'git branch `--show-current` works properly when tag 
> exists' '
> +   cat >expect <<-\EOF &&
> +   branch-and-tag-name
> +   EOF
> +   test_when_finished "git branch -D branch-and-tag-name" &&
> +   git checkout -b branch-and-tag-name &&
> +   test_when_finished "git tag -d branch-and-tag-name" &&
> +   git tag branch-and-tag-name &&
> +   git branch --show-current >actual &&
> +   git checkout branch-one &&

This cleanup "checkout" needs to be encapsulated within a
test_when_finished(), doesn't it? Preferably just after the "git
checkout -b" invocation.

> +   test_cmp expect actual
> +'


Re: What's cooking in git.git (Oct 2018, #01; Wed, 10)

2018-10-12 Thread Junio C Hamano
Phillip Wood  writes:

> It would be nice if the parsing used starts_with(option_name, user_text)
> rather than strcmp() as well. Also I think --color-moved=no is valid as
> a synonym of --no-color-moved but --color-moved-ws=no is not supported.

I am not sure about starts_with().  Do you mean we should accept
"--color-mo", as that is a prefix of "--color-moved" that is not
shared with any existing option, until we gain a different option
"--color-more"?

If you mean "--color-moved-ws=no" (or "--no-color-moved-ws") as a
way to countermand an earlier --color-moved-ws= on the
command line, I fully agree that it is a good idea.


[PATCH v4] branch: introduce --show-current display option

2018-10-12 Thread Daniels Umanovskis
When called with --show-current, git branch will print the current
branch name and terminate. Only the actual name gets printed,
without refs/heads. In detached HEAD state, nothing is output.

Intended both for scripting and interactive/informative use.
Unlike git branch --list, no filtering is needed to just get the
branch name.

Signed-off-by: Daniels Umanovskis 
---

Compared to v3, fixed up test cases according to Junio's input

 Documentation/git-branch.txt |  6 +-
 builtin/branch.c | 25 +++--
 t/t3203-branch-output.sh | 43 +++
 3 files changed, 71 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
index bf5316ffa..0babb9b1b 100644
--- a/Documentation/git-branch.txt
+++ b/Documentation/git-branch.txt
@@ -9,7 +9,7 @@ SYNOPSIS
 
 [verse]
 'git branch' [--color[=] | --no-color] [-r | -a]
-   [--list] [-v [--abbrev= | --no-abbrev]]
+   [--list] [--show-current] [-v [--abbrev= | --no-abbrev]]
[--column[=] | --no-column] [--sort=]
[(--merged | --no-merged) []]
[--contains []]
@@ -160,6 +160,10 @@ This option is only applicable in non-verbose mode.
branch --list 'maint-*'`, list only the branches that match
the pattern(s).
 
+--show-current::
+   Print the name of the current branch. In detached HEAD state,
+   nothing is printed.
+
 -v::
 -vv::
 --verbose::
diff --git a/builtin/branch.c b/builtin/branch.c
index c396c4153..46f91dc06 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -443,6 +443,21 @@ static void print_ref_list(struct ref_filter *filter, 
struct ref_sorting *sortin
free(to_free);
 }
 
+static void print_current_branch_name(void)
+{
+   int flags;
+   const char *refname = resolve_ref_unsafe("HEAD", 0, NULL, );
+   const char *shortname;
+   if (!refname)
+   die(_("could not resolve HEAD"));
+   else if (!(flags & REF_ISSYMREF))
+   return;
+   else if (skip_prefix(refname, "refs/heads/", ))
+   puts(shortname);
+   else
+   die(_("HEAD (%s) points outside of refs/heads/"), refname);
+}
+
 static void reject_rebase_or_bisect_branch(const char *target)
 {
struct worktree **worktrees = get_worktrees(0);
@@ -581,6 +596,7 @@ static int edit_branch_description(const char *branch_name)
 int cmd_branch(int argc, const char **argv, const char *prefix)
 {
int delete = 0, rename = 0, copy = 0, force = 0, list = 0;
+   int show_current = 0;
int reflog = 0, edit_description = 0;
int quiet = 0, unset_upstream = 0;
const char *new_upstream = NULL;
@@ -620,6 +636,7 @@ int cmd_branch(int argc, const char **argv, const char 
*prefix)
OPT_BIT('c', "copy", , N_("copy a branch and its reflog"), 
1),
OPT_BIT('C', NULL, , N_("copy a branch, even if target 
exists"), 2),
OPT_BOOL('l', "list", , N_("list branch names")),
+   OPT_BOOL(0, "show-current", _current, N_("show current 
branch name")),
OPT_BOOL(0, "create-reflog", , N_("create the branch's 
reflog")),
OPT_BOOL(0, "edit-description", _description,
 N_("edit the description for the branch")),
@@ -662,14 +679,15 @@ int cmd_branch(int argc, const char **argv, const char 
*prefix)
argc = parse_options(argc, argv, prefix, options, builtin_branch_usage,
 0);
 
-   if (!delete && !rename && !copy && !edit_description && !new_upstream 
&& !unset_upstream && argc == 0)
+   if (!delete && !rename && !copy && !edit_description && !new_upstream &&
+   !show_current && !unset_upstream && argc == 0)
list = 1;
 
if (filter.with_commit || filter.merge != REF_FILTER_MERGED_NONE || 
filter.points_at.nr ||
filter.no_commit)
list = 1;
 
-   if (!!delete + !!rename + !!copy + !!new_upstream +
+   if (!!delete + !!rename + !!copy + !!new_upstream + !!show_current +
list + unset_upstream > 1)
usage_with_options(builtin_branch_usage, options);
 
@@ -697,6 +715,9 @@ int cmd_branch(int argc, const char **argv, const char 
*prefix)
if (!argc)
die(_("branch name required"));
return delete_branches(argc, argv, delete > 1, filter.kind, 
quiet);
+   } else if (show_current) {
+   print_current_branch_name();
+   return 0;
} else if (list) {
/*  git branch --local also shows HEAD when it is detached */
if ((filter.kind & FILTER_REFS_BRANCHES) && filter.detached)
diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh
index ee6787614..1bf708dff 100755
--- a/t/t3203-branch-output.sh
+++ b/t/t3203-branch-output.sh
@@ -100,6 +100,49 @@ test_expect_success 'git branch -v pattern does not 

[PATCH v3 1/2] rebase -i: clarify what happens on a failed `exec`

2018-10-12 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

We had not documented previously what happens when an `exec` command in
an interactive rebase fails. Now we do.

Signed-off-by: Johannes Schindelin 
---
 Documentation/git-rebase.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 091eb53faa..d9771bd25b 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -420,7 +420,8 @@ idea unless you know what you are doing (see BUGS below).
 --exec ::
Append "exec " after each line creating a commit in the
final history.  will be interpreted as one or more shell
-   commands.
+   commands. Any command that fails will interrupt the rebase,
+   with exit code 1.
 +
 You may execute several commands by either using one instance of `--exec`
 with several commands:
-- 
gitgitgadget



[PATCH v3 0/2] rebase -i: introduce the 'break' command

2018-10-12 Thread Johannes Schindelin via GitGitGadget
Stefan asked a while back
[https://public-inbox.org/git/20180118183618.39853-3-sbel...@google.com/] 
for a todo command in interactive rebases that would essentially perform the
"stop" part of the edit command, but without the "pick" part: interrupt the
interactive rebase, with exit code 0, letting the user do things and stuff
and look around, with the idea of continuing the rebase later (using git
rebase --continue).

This patch introduces that, based on ag/rebase-i-in-c.

Changes since v2:

 * Fixed two typos.
 * When interrupting the rebase, break now outputs a message.

Changes since v1:

 * Wrapped the commit message correctly.
 * Added a preparatory patch to mention what happens in case an exec fails.
 * Mentioned the break command in the git-rebase(1) documentation.

Cc: Stefan Beller sbel...@google.com [sbel...@google.com]Cc: Eric Sunshine 
sunsh...@sunshineco.com [sunsh...@sunshineco.com]

Johannes Schindelin (2):
  rebase -i: clarify what happens on a failed `exec`
  rebase -i: introduce the 'break' command

 Documentation/git-rebase.txt |  6 +-
 rebase-interactive.c |  1 +
 sequencer.c  | 24 +++-
 t/lib-rebase.sh  |  2 +-
 t/t3418-rebase-continue.sh   |  9 +
 5 files changed, 39 insertions(+), 3 deletions(-)


base-commit: 34b47315d9721a576b9536492cca0c11588113a2
Published-As: 
https://github.com/gitgitgadget/git/releases/tags/pr-43%2Fdscho%2Frebase-i-break-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git 
pr-43/dscho/rebase-i-break-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/43

Range-diff vs v2:

 1:  2eefdb4874 ! 1:  92512a82d2 rebase -i: clarify what happens on a failed 
`exec`
 @@ -15,8 +15,8 @@
Append "exec " after each line creating a commit in the
final history.  will be interpreted as one or more shell
  - commands.
 -+ commands. Anz command that fails will interrupt the rebase,
 -+ withe exit code 1.
 ++ commands. Any command that fails will interrupt the rebase,
 ++ with exit code 1.
   +
   You may execute several commands by either using one instance of `--exec`
   with several commands:
 2:  c74e02c4b6 ! 2:  d44b425709 rebase -i: introduce the 'break' command
 @@ -71,13 +71,37 @@
if (bol != eol)
return error(_("%s does not accept arguments: '%s'"),
 command_to_string(item->command), bol);
 +@@
 +  return update_ref(NULL, "ORIG_HEAD", , NULL, 0, 
UPDATE_REFS_MSG_ON_ERR);
 + }
 + 
 ++static int stopped_at_head(void)
 ++{
 ++ struct object_id head;
 ++ struct commit *commit;
 ++ struct commit_message message;
 ++
 ++ if (get_oid("HEAD", ) || !(commit = lookup_commit()) ||
 ++ parse_commit(commit) || get_message(commit, ))
 ++ fprintf(stderr, _("Stopped at HEAD\n"));
 ++ else {
 ++ fprintf(stderr, _("Stopped at %s\n"), message.label);
 ++ free_message(commit, );
 ++ }
 ++ return 0;
 ++
 ++}
 ++
 + static const char rescheduled_advice[] =
 + N_("Could not execute the todo command\n"
 + "\n"
  @@
unlink(rebase_path_stopped_sha());
unlink(rebase_path_amend());
delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
  +
  + if (item->command == TODO_BREAK)
 -+ break;
 ++ return stopped_at_head();
}
if (item->command <= TODO_SQUASH) {
if (is_rebase_i(opts))

-- 
gitgitgadget


[PATCH v3 2/2] rebase -i: introduce the 'break' command

2018-10-12 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

The 'edit' command can be used to cherry-pick a commit and then
immediately drop out of the interactive rebase, with exit code 0, to let
the user amend the commit, or test it, or look around.

Sometimes this functionality would come in handy *without*
cherry-picking a commit, e.g. to interrupt the interactive rebase even
before cherry-picking a commit, or immediately after an 'exec' or a
'merge'.

This commit introduces that functionality, as the spanking new 'break'
command.

Suggested-by: Stefan Beller 
Signed-off-by: Johannes Schindelin 
---
 Documentation/git-rebase.txt |  3 +++
 rebase-interactive.c |  1 +
 sequencer.c  | 24 +++-
 t/lib-rebase.sh  |  2 +-
 t/t3418-rebase-continue.sh   |  9 +
 5 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index d9771bd25b..6b71694b0d 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -561,6 +561,9 @@ By replacing the command "pick" with the command "edit", 
you can tell
 the files and/or the commit message, amend the commit, and continue
 rebasing.
 
+To interrupt the rebase (just like an "edit" command would do, but without
+cherry-picking any commit first), use the "break" command.
+
 If you just want to edit the commit message for a commit, replace the
 command "pick" with the command "reword".
 
diff --git a/rebase-interactive.c b/rebase-interactive.c
index 0f4119cbae..78f3263fc1 100644
--- a/rebase-interactive.c
+++ b/rebase-interactive.c
@@ -14,6 +14,7 @@ void append_todo_help(unsigned edit_todo, unsigned keep_empty,
 "s, squash  = use commit, but meld into previous commit\n"
 "f, fixup  = like \"squash\", but discard this commit's log message\n"
 "x, exec  = run command (the rest of the line) using shell\n"
+"b, break = stop here (continue rebase later with 'git rebase --continue')\n"
 "d, drop  = remove commit\n"
 "l, label  = label current HEAD with a name\n"
 "t, reset  = reset HEAD to a label\n"
diff --git a/sequencer.c b/sequencer.c
index 8dd6db5a01..ee3961ec63 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1416,6 +1416,7 @@ enum todo_command {
TODO_SQUASH,
/* commands that do something else than handling a single commit */
TODO_EXEC,
+   TODO_BREAK,
TODO_LABEL,
TODO_RESET,
TODO_MERGE,
@@ -1437,6 +1438,7 @@ static struct {
{ 'f', "fixup" },
{ 's', "squash" },
{ 'x', "exec" },
+   { 'b', "break" },
{ 'l', "label" },
{ 't', "reset" },
{ 'm', "merge" },
@@ -1964,7 +1966,7 @@ static int parse_insn_line(struct todo_item *item, const 
char *bol, char *eol)
padding = strspn(bol, " \t");
bol += padding;
 
-   if (item->command == TODO_NOOP) {
+   if (item->command == TODO_NOOP || item->command == TODO_BREAK) {
if (bol != eol)
return error(_("%s does not accept arguments: '%s'"),
 command_to_string(item->command), bol);
@@ -3247,6 +3249,23 @@ static int checkout_onto(struct replay_opts *opts,
return update_ref(NULL, "ORIG_HEAD", , NULL, 0, 
UPDATE_REFS_MSG_ON_ERR);
 }
 
+static int stopped_at_head(void)
+{
+   struct object_id head;
+   struct commit *commit;
+   struct commit_message message;
+
+   if (get_oid("HEAD", ) || !(commit = lookup_commit()) ||
+   parse_commit(commit) || get_message(commit, ))
+   fprintf(stderr, _("Stopped at HEAD\n"));
+   else {
+   fprintf(stderr, _("Stopped at %s\n"), message.label);
+   free_message(commit, );
+   }
+   return 0;
+
+}
+
 static const char rescheduled_advice[] =
 N_("Could not execute the todo command\n"
 "\n"
@@ -3293,6 +3312,9 @@ static int pick_commits(struct todo_list *todo_list, 
struct replay_opts *opts)
unlink(rebase_path_stopped_sha());
unlink(rebase_path_amend());
delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
+
+   if (item->command == TODO_BREAK)
+   return stopped_at_head();
}
if (item->command <= TODO_SQUASH) {
if (is_rebase_i(opts))
diff --git a/t/lib-rebase.sh b/t/lib-rebase.sh
index 25a77ee5cb..584604ee63 100644
--- a/t/lib-rebase.sh
+++ b/t/lib-rebase.sh
@@ -49,7 +49,7 @@ set_fake_editor () {
case $line in
squash|fixup|edit|reword|drop)
action="$line";;
-   exec*)
+   exec*|break)
echo "$line" | sed 's/_/ /g' >> "$1";;
"#")
echo '# comment' >> "$1";;
diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh
index c145dbac38..185a491089 100755
--- a/t/t3418-rebase-continue.sh
+++ 

[PATCH] object_id.cocci: match only expressions of type 'struct object_id'

2018-10-12 Thread SZEDER Gábor
Most of our semantic patches in 'contrib/coccinelle/object_id.cocci'
turn calls of SHA1-specific functions into calls of their
corresponding object_id counterparts, e.g. sha1_to_hex() to
oid_to_hex().  These semantic patches look something like this:

  @@
  expression E1;
  @@
  - sha1_to_hex(E1.hash)
  + oid_to_hex()

and match the access to the 'hash' field in any data type, not only in
'struct object_id', and, consquently, can produce wrong
transformations.

Case in point is the recent hash function transition patch "rerere:
convert to use the_hash_algo" [1], which, among other things, renamed
'struct rerere_dir's 'sha1' field to 'hash', and then 'make
coccicheck' started to suggest the following wrong transformations for
'rerere.c' [2]:

  -return sha1_to_hex(id->collection->hash);
  +return oid_to_hex(id->collection);

and

  -DIR *dir = opendir(git_path("rr-cache/%s", sha1_to_hex(rr_dir->hash)));
  +DIR *dir = opendir(git_path("rr-cache/%s", oid_to_hex(rr_dir)));

Avoid such wrong transformations by tightening semantic patches in
'object_id.cocci' to match only type of or pointers to 'struct
object_id'.

[1] 
https://public-inbox.org/git/20181008215701.779099-15-sand...@crustytoothpaste.net/
[2] https://travis-ci.org/git/git/jobs/440463476#L580

Signed-off-by: SZEDER Gábor 
---

I think this patch should come before that patch mentioned in the commit
message, or perhaps even before the whole patch series.

 contrib/coccinelle/object_id.cocci | 117 -
 1 file changed, 63 insertions(+), 54 deletions(-)

diff --git a/contrib/coccinelle/object_id.cocci 
b/contrib/coccinelle/object_id.cocci
index d8bdb48712..6a7cf3e02d 100644
--- a/contrib/coccinelle/object_id.cocci
+++ b/contrib/coccinelle/object_id.cocci
@@ -1,119 +1,127 @@
 @@
-expression E1;
+struct object_id OID;
 @@
-- is_null_sha1(E1.hash)
-+ is_null_oid()
+- is_null_sha1(OID.hash)
++ is_null_oid()
 
 @@
-expression E1;
+struct object_id *OIDPTR;
 @@
-- is_null_sha1(E1->hash)
-+ is_null_oid(E1)
+- is_null_sha1(OIDPTR->hash)
++ is_null_oid(OIDPTR)
 
 @@
-expression E1;
+struct object_id OID;
 @@
-- sha1_to_hex(E1.hash)
-+ oid_to_hex()
+- sha1_to_hex(OID.hash)
++ oid_to_hex()
 
 @@
 identifier f != oid_to_hex;
-expression E1;
+struct object_id *OIDPTR;
 @@
   f(...) {<...
-- sha1_to_hex(E1->hash)
-+ oid_to_hex(E1)
+- sha1_to_hex(OIDPTR->hash)
++ oid_to_hex(OIDPTR)
   ...>}
 
 @@
-expression E1, E2;
+expression E;
+struct object_id OID;
 @@
-- sha1_to_hex_r(E1, E2.hash)
-+ oid_to_hex_r(E1, )
+- sha1_to_hex_r(E, OID.hash)
++ oid_to_hex_r(E, )
 
 @@
 identifier f != oid_to_hex_r;
-expression E1, E2;
+expression E;
+struct object_id *OIDPTR;
 @@
f(...) {<...
-- sha1_to_hex_r(E1, E2->hash)
-+ oid_to_hex_r(E1, E2)
+- sha1_to_hex_r(E, OIDPTR->hash)
++ oid_to_hex_r(E, OIDPTR)
   ...>}
 
 @@
-expression E1;
+struct object_id OID;
 @@
-- hashclr(E1.hash)
-+ oidclr()
+- hashclr(OID.hash)
++ oidclr()
 
 @@
 identifier f != oidclr;
-expression E1;
+struct object_id *OIDPTR;
 @@
   f(...) {<...
-- hashclr(E1->hash)
-+ oidclr(E1)
+- hashclr(OIDPTR->hash)
++ oidclr(OIDPTR)
   ...>}
 
 @@
-expression E1, E2;
+struct object_id OID1, OID2;
 @@
-- hashcmp(E1.hash, E2.hash)
-+ oidcmp(, )
+- hashcmp(OID1.hash, OID2.hash)
++ oidcmp(, )
 
 @@
 identifier f != oidcmp;
-expression E1, E2;
+struct object_id *OIDPTR1, OIDPTR2;
 @@
   f(...) {<...
-- hashcmp(E1->hash, E2->hash)
-+ oidcmp(E1, E2)
+- hashcmp(OIDPTR1->hash, OIDPTR2->hash)
++ oidcmp(OIDPTR1, OIDPTR2)
   ...>}
 
 @@
-expression E1, E2;
+struct object_id *OIDPTR;
+struct object_id OID;
 @@
-- hashcmp(E1->hash, E2.hash)
-+ oidcmp(E1, )
+- hashcmp(OIDPTR->hash, OID.hash)
++ oidcmp(OIDPTR, )
 
 @@
-expression E1, E2;
+struct object_id *OIDPTR;
+struct object_id OID;
 @@
-- hashcmp(E1.hash, E2->hash)
-+ oidcmp(, E2)
+- hashcmp(OID.hash, OIDPTR->hash)
++ oidcmp(, OIDPTR)
 
 @@
-expression E1, E2;
+struct object_id OID1, OID2;
 @@
-- hashcpy(E1.hash, E2.hash)
-+ oidcpy(, )
+- hashcpy(OID1.hash, OID2.hash)
++ oidcpy(, )
 
 @@
 identifier f != oidcpy;
-expression E1, E2;
+struct object_id *OIDPTR1;
+struct object_id *OIDPTR2;
 @@
   f(...) {<...
-- hashcpy(E1->hash, E2->hash)
-+ oidcpy(E1, E2)
+- hashcpy(OIDPTR1->hash, OIDPTR2->hash)
++ oidcpy(OIDPTR1, OIDPTR2)
   ...>}
 
 @@
-expression E1, E2;
+struct object_id *OIDPTR;
+struct object_id OID;
 @@
-- hashcpy(E1->hash, E2.hash)
-+ oidcpy(E1, )
+- hashcpy(OIDPTR->hash, OID.hash)
++ oidcpy(OIDPTR, )
 
 @@
-expression E1, E2;
+struct object_id *OIDPTR;
+struct object_id OID;
 @@
-- hashcpy(E1.hash, E2->hash)
-+ oidcpy(, E2)
+- hashcpy(OID.hash, OIDPTR->hash)
++ oidcpy(, OIDPTR)
 
 @@
-expression E1, E2;
+struct object_id *OIDPTR1;
+struct object_id *OIDPTR2;
 @@
-- oidcmp(E1, E2) == 0
-+ oideq(E1, E2)
+- oidcmp(OIDPTR1, OIDPTR2) == 0
++ oideq(OIDPTR1, OIDPTR2)
 
 @@
 identifier f != hasheq;
@@ -125,10 +133,11 @@ expression E1, E2;
   ...>}
 
 @@
-expression E1, E2;
+struct object_id *OIDPTR1;
+struct object_id *OIDPTR2;
 @@
-- 

Git Test Coverage Report (Friday, Oct 12)

2018-10-12 Thread Derrick Stolee



In an effort to ensure new code is reasonably covered by the test suite, 
we now have contrib/coverage-diff.sh to combine the gcov output from 
'make coverage-test ; make coverage-report' with the output from 'git 
diff A B' to discover _new_ lines of code that are not covered.


This report takes the output of these results after running on four 
branches:


        pu: 13cafb433ca0dd31b3ea229a79cc8507aa89ee19 (tests are broken, 
so coverage not reported)


       jch: fda196ac82002ede984896861cf28a354044d1a0

      next: 76f2f5c1e34c4dbef1029e2984c2892894c444ce

    master: 04861070400d3ca9d487bd0d736ca818b9ffe371

 maint: cae598d9980661a978e2df4fb338518f7bf09572

I ran the test suite on each of these branches on an Ubuntu Linux VM, 
and I'm missing some dependencies (like apache, svn, and perforce) so 
not all tests are run.


I submit this output without comment. I'm taking a look especially at my 
own lines to see where coverage can be improved.


Thanks,

-Stolee

Uncovered code new in jch, compared to next. Build logs at [1]
--

builtin/archive.c
e001fd3a50 builtin/archive.c  78) die(_("git archive: expected ACK/NAK, 
got a flush packet"));

e001fd3a50 builtin/archive.c  80) if (starts_with(reader.line, "NACK "))
e001fd3a50 builtin/archive.c  81) die(_("git archive: NACK %s"), 
reader.line + 5);

e001fd3a50 builtin/archive.c  82) if (starts_with(reader.line, "ERR "))
e001fd3a50 builtin/archive.c  83) die(_("remote error: %s"), reader.line 
+ 4);

e001fd3a50 builtin/archive.c  84) die(_("git archive: protocol error"));
e001fd3a50 builtin/archive.c  89) die(_("git archive: expected a flush"));
fb19d32f05 builtin/archive.c  99) if (version != discover_version())
fb19d32f05 builtin/archive.c 100) die(_("git archive: received different 
protocol versions in subsequent requests"));


builtin/fetch.c
7bbc53a589 builtin/fetch.c  385) continue; /* can this happen??? */

builtin/rebase--interactive.c
53bbcfbde7 builtin/rebase--interactive2.c  24) return error(_("no HEAD?"));
53bbcfbde7 builtin/rebase--interactive2.c  51) return 
error_errno(_("could not create temporary %s"), path_state_dir());
53bbcfbde7 builtin/rebase--interactive2.c  57) return 
error_errno(_("could not mark as interactive"));

53bbcfbde7 builtin/rebase--interactive2.c  77) return -1;
53bbcfbde7 builtin/rebase--interactive2.c  81) return -1;
53bbcfbde7 builtin/rebase--interactive2.c  87) free(revisions);
53bbcfbde7 builtin/rebase--interactive2.c  88) free(shortrevisions);
53bbcfbde7 builtin/rebase--interactive2.c  90) return -1;
53bbcfbde7 builtin/rebase--interactive2.c  98) free(revisions);
53bbcfbde7 builtin/rebase--interactive2.c  99) free(shortrevisions);
53bbcfbde7 builtin/rebase--interactive2.c 101) return 
error_errno(_("could not open %s"), rebase_path_todo());
53bbcfbde7 builtin/rebase--interactive2.c 106) 
argv_array_push(_script_args, restrict_revision);
53bbcfbde7 builtin/rebase--interactive2.c 114) error(_("could not 
generate todo list"));
53bbcfbde7 builtin/rebase--interactive2.c 206) 
usage_with_options(builtin_rebase_interactive_usage, options);
53bbcfbde7 builtin/rebase--interactive2.c 220) 
warning(_("--[no-]rebase-cousins has no effect without "
0af129b2ed builtin/rebase--interactive2.c 226) die(_("a base commit must 
be provided with --upstream or --onto"));

34b47315d9 builtin/rebase--interactive.c  261) ret = rearrange_squash();
34b47315d9 builtin/rebase--interactive.c  262) break;
34b47315d9 builtin/rebase--interactive.c  264) ret = 
sequencer_add_exec_commands(cmd);

34b47315d9 builtin/rebase--interactive.c  265) break;
0af129b2ed builtin/rebase--interactive2.c 267) BUG("invalid command 
'%d'", command);


builtin/rebase.c
55071ea248   61) strbuf_trim();
55071ea248   62) ret = !strcmp("true", out.buf);
55071ea248   63) strbuf_release();
002ee2fe68  115) die(_("%s requires an interactive rebase"), option);
f95736288a  148) return error_errno(_("could not read '%s'"), path);
f95736288a  162) return -1;
f95736288a  167) return error(_("could not get 'onto': '%s'"), buf.buf);
f95736288a  178) return -1;
f95736288a  179) } else if (read_one(state_dir_path("head", opts), ))
f95736288a  180) return -1;
f95736288a  182) return error(_("invalid orig-head: '%s'"), buf.buf);
f95736288a  186) return -1;
f95736288a  188) opts->flags &= ~REBASE_NO_QUIET;
73d51ed0a5  196) opts->signoff = 1;
73d51ed0a5  197) opts->flags |= REBASE_FORCE;
ead98c111b  204) return -1;
12026a412c  219) return -1;
ba1905a5fe  227) return -1;
ba1905a5fe  235) return -1;
6defce2b02  255) return error(_("Could not read '%s'"), path);
6defce2b02  271) res = error(_("Cannot store %s"), autostash.buf);
6defce2b02  275) return res;
bc24382c2b  373) argv_array_pushf(,
bc24382c2b  375) oid_to_hex(>restrict_revision->object.oid));
ac7f467fef  488) BUG("Unhandled rebase type %d", opts->type);
ac7f467fef  507) struct strbuf dir = STRBUF_INIT;
6defce2b02  509) apply_autostash(opts);
ac7f467fef  510) 

Re: [PATCH v3 4/7] revision.c: begin refactoring --topo-order logic

2018-10-12 Thread Derrick Stolee

On 10/12/2018 2:33 AM, Junio C Hamano wrote:

"Derrick Stolee via GitGitGadget"  writes:


* revs->limited implies we run limit_list() to walk the entire
   reachable set. There are some short-cuts here, such as if we
   perform a range query like 'git rev-list COMPARE..HEAD' and we
   can stop limit_list() when all queued commits are uninteresting.

* revs->topo_order implies we run sort_in_topological_order(). See
   the implementation of that method in commit.c. It implies that
   the full set of commits to order is in the given commit_list.

These two methods imply that a 'git rev-list --topo-order HEAD'
command must walk the entire reachable set of commits _twice_ before
returning a single result.

With or without "--topo-order", running rev-list without any
negative commit means we must dig down to the roots that can be
reached from the positive commits we have.
If we use default order in 'git log', we don't walk all the way to the 
root commits, and instead trust the commit-date. (This is different than 
--date-order, which does guarantee parents after children.) In this 
case, revs->limited is false.

I am to sure if having to run the "sort" of order N counts as "walk
the entire reachable set once" (in addition to the enumeration that
must be done to prepare that N commits, performed in limit_list()).


sort_in_topological_order() does actually _two_ walks (the in-degree 
computation plus the walk that peels commits of in-degree zero), but 
those walks are cheaper because we've already parsed the commits in 
limit_list().

3. expand_topo_walk() provides get_revision_1() with a way to signal
walking beyond the latest commit. Currently, this calls
add_parents_to_list() exactly like the old logic.

"latest"?  We dig down the history from newer to older, so at some
point we hit an old commit and need to find the parents to keep
walking towards even older parts of the history.  Did you mean
"earliest" instead?
I mean "latest" in terms of the algorithm, so "the commit that was 
returned by get_revision_1() most recently". This could use some 
rewriting for clarity.
  
@@ -2454,7 +2455,7 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s

if (revs->diffopt.objfind)
revs->simplify_history = 0;
  
-	if (revs->topo_order)

+   if (revs->topo_order && !generation_numbers_enabled(the_repository))
revs->limited = 1;

Are we expecting that this is always a bool?  Can there be new
commits for which generation numbers are not computed and stored
while all the old, stable and packed commits have generation
numbers?


For this algorithm to work, we only care that _some_ commits have 
generation numbers. We expect that if a commit-graph file exists with 
generation numbers, then the majority of commits have generation 
numbers. The commits that were added or fetched since the commit-graph 
was written will have generation number INFINITY, but the topo-order 
algorithm will still work and be efficient in those cases. (This is also 
why we have the "half graph" case in test_three_modes.)



@@ -2892,6 +2893,33 @@ static int mark_uninteresting(const struct object_id 
*oid,
return 0;
  }
  
+struct topo_walk_info {};

+
+static void init_topo_walk(struct rev_info *revs)
+{
+   struct topo_walk_info *info;
+   revs->topo_walk_info = xmalloc(sizeof(struct topo_walk_info));
+   info = revs->topo_walk_info;
+   memset(info, 0, sizeof(struct topo_walk_info));

There is no member in the struct at this point.  Are we sure this is
safe?  Just being curious.  I know xmalloc() gives us at least one
byte and info won't be NULL.  I just do not know offhand if we have
a guarantee that memset() acts sensibly to fill the first 0 bytes.
This is a good question. It seems to work for me when I check out your 
version of this commit (6c04ff30 "revision.c: begin refactoring 
--topo-order logic") and run all tests.

+   limit_list(revs);
+   sort_in_topological_order(>commits, revs->sort_order);
+}
+
+static struct commit *next_topo_commit(struct rev_info *revs)
+{
+   return pop_commit(>commits);
+}
+
+static void expand_topo_walk(struct rev_info *revs, struct commit *commit)
+{
+   if (add_parents_to_list(revs, commit, >commits, NULL) < 0) {
+   if (!revs->ignore_missing_links)
+   die("Failed to traverse parents of commit %s",
+   oid_to_hex(>object.oid));
+   }
+}
+
  int prepare_revision_walk(struct rev_info *revs)
  {
int i;
@@ -2928,11 +2956,13 @@ int prepare_revision_walk(struct rev_info *revs)
commit_list_sort_by_date(>commits);
if (revs->no_walk)
return 0;
-   if (revs->limited)
+   if (revs->limited) {
if (limit_list(revs) < 0)
return -1;
-   if (revs->topo_order)
-   sort_in_topological_order(>commits, revs->sort_order);
+   

Re: [PATCH 04/15] sequencer: refactor sequencer_add_exec_commands() to work on a todo_list

2018-10-12 Thread Alban Gruin
Le 12/10/2018 à 11:54, Phillip Wood a écrit :
> On 11/10/2018 17:57, Alban Gruin wrote:
> > Hi Phillip,
> > 
> > thanks for taking the time to review my patches.
> > 
> > Le 11/10/2018 à 13:25, Phillip Wood a écrit :
> >> On 07/10/2018 20:54, Alban Gruin wrote:
> >>> @@ -4419,15 +4406,38 @@ int sequencer_add_exec_commands(const char
> >>> *commands)
> >>>   }
> >>> /* insert or append final  */
> >>> -if (insert >= 0 && insert < todo_list.nr)
> >>> -strbuf_insert(buf, todo_list.items[insert].offset_in_buf +
> >>> +if (insert >= 0 && insert < todo_list->nr)
> >>> +strbuf_insert(buf, todo_list->items[insert].offset_in_buf +
> >>> offset, commands, commands_len);
> >>>   else if (insert >= 0 || !offset)
> >>>   strbuf_add(buf, commands, commands_len);
> >>>   -i = write_message(buf->buf, buf->len, todo_file, 0);
> >>> +if (todo_list_parse_insn_buffer(buf->buf, todo_list))
> >>> +BUG("unusable todo list");}
> >> 
> >> It is a shame to have to re-parse the todo list, I wonder how difficult
> >> it would be to adjust the todo_list item array as the exec commands are
> >> inserted. The same applies to the next couple of patches
> > 
> > Good question.
> > 
> > This function inserts an `exec' command after every `pick' command.
> > These commands are stored in a dynamically allocated list, grew with
> > ALLOW_GROW().
> > 
> > If we want to keep the current structure, we would have to grow the size
> > of the list by 1 and move several element to the end every time we want
> > to add an `exec' command.  It would not be very effective.  Perhaps I
> > should use a linked list here, instead.  It may also work well with
> > rearrange_squash() and skip_unnecessary_picks().
> > 
> > Maybe we could even get rid of the strbuf at some point.
> 
> Another way would be to use the strbuf as a string pool rather than a
> copy of the text of the file. There could be a write_todo_list()
> function that takes a todo list and some flags, iterates over the items
> in the todo list and writes the file. The flags would specify whether to
> append the todo help and whether to abbreviate the object ids (so there
> is no need for a separate call to transform_todos()). Then
> add_exec_commands() could allocate a new array of todo items which it
> builds up as it works through the original list and replaces the
> original list with the new one at the end. The text of the exec items
> can be added to the end of the strbuf (we only need one copy of the exec
> text with this scheme). rearrange_squash() can use a temporary array to
> build a new list as well or just memmove() things but that might be
> slower if there is a lot of rearrangement to do. skip_unecessary_picks()
> could just set the current item to the one we want to start with.
> 

This sounds good, and it looks like the solution dscho proposed on IRC a few 
hours ago[0].  I will try to do this.

[0] http://colabti.org/irclogger/irclogger_log/git-devel?date=2018-10-12#l46

Cheers,
Alban






URGENT RESPONSE NEEDED

2018-10-12 Thread Sean Kim.
Hello my dear.

Did you receive my email message to you? Please, get back to me ASAP as the 
matter is becoming late.  Expecting your urgent response.

Sean.



Re: [PATCH v4 0/1] contrib: Add script to show uncovered "new" lines

2018-10-12 Thread Derrick Stolee

On 10/11/2018 11:01 PM, Junio C Hamano wrote:

"Derrick Stolee via GitGitGadget"  writes:


CHANGES IN V4: I reduced the blame output using -s which decreases the
width. I include a summary of the commit authors at the end to help people
see the lines they wrote. This version is also copied into a build
definition in the public Git project on Azure Pipelines [1]. I'll use this
build definition to generate the coverage report after each "What's Cooking"
email.

[1] https://git.visualstudio.com/git/_build?definitionId=5

Thanks.  Let's move this forward by merging it down to 'next'.
Sounds good. When it moves into 'master' I can update my build to call 
the script from source.


Thanks,
-Stolee


Re: [PATCH v2 00/18] builtin rebase options

2018-10-12 Thread Johannes Schindelin
Hi Junio,

On Thu, 6 Sep 2018, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" 
> writes:
> 
> > This patch series completes the support for all rebase options in the
> > builtin rebase, e.g. --signoff, rerere-autoupdate, etc.
> >
> > It is based on pk/rebase -in-c-3-acts.
> 
> ... which in turn was based on pk/rebase-in-c-2-basic that just got
> rerolled, so I would assume that you want pk/rebase-in-c-3-acts I
> have rebased on top of the result of applying the updated 2-basic
> series.
> 
> I've rebuilt the collection of topics up to pk/rebase-in-c-6-final
> with these two updated series twice, once doing it manually, like I
> did the last time, and another using "rebase -i -r" on top of the
> updated pk/rebase-in-c-4-opts.  The resulting trees match, of
> course.
> 
> I did it twice to try out how it feels to use "rebase -i -r" because
> I wanted to make sure what we are shipping in 'master' behaves
> sensibly ;-)
> 
> Two things I noticed about the recreation of the merge ...
> 
>   Reminder to bystanders.  We need to merge ag/rebase-i-in-c
>   topic on top of pk/reabse-in-c-5-test topic before applying
>   a patch to adjust rebase to call rebase-i using the latter's
>   new calling convention.  The topics look like
> 
>   - pk/rebase-in-c has three patches on master
>   - pk/rebase-in-c-2-basic builds on it, and being replaced
>   - pk/rebase-in-c-3-acts builds on 2-basic (no update this time)
>   - pk/rebase-in-c-4-opts builds on 3-acts, and being replaced
>   - pk/rebase-in-c-5-test builds on 4-opts (no update this time)
>   - js/rebase-in-c-5.5 builds on 5-test and merges ag/rebase-in-c
> topic before applying one patch on it (no update this time)
>   - pk/rebase-in-c-6-final builds on 5.5 (no update this time)
> 
>   and we are replacing 2-basic with 11 patches and 4-opts with
>   18 patches.
> 
> ... using "rebase -i -r" are that 
> 
>  (1) it rebuilt, or at least offered to rebuild, the entire side
>  branch, even though there is absolutely no need to.  Leaving
>  "pick"s untouched, based on the correct fork point, resulted in
>  all picks fast forwarded, but it was somewhat alarming.

Right. But this is a legacy of our paradigm to script things in Unix shell
script. It not only is slow, error-prone and hard to keep portable, it
also encourages poor design, as you do not have the same expressive power
as C has.

In this case, it harmed us by making it impossible to essentially play out
the rebase in memory and only fall back to writing things into the
worktree upon failure.

However, this is where we want to go. It is still a long way to go,
though, as many code parts are safely in the "we use the worktree to play
out the rebase in its entirety" place.

The "skip_unnecessary_picks" trick is the best we could do so far.

>  (2) "merge -C  ag/rebase-i-in-c" appeared as
>  the insn to merge in the (possibly rebuilt) side branch.  And
>  just like "commit -C", it took the merge message from the
>  original merge commit, which means that the summary of the
>  merged side branch is kept stale.  In this particular case, I
>  did not even want to see ag/rebase-i-in-c topic touched, so I
>  knew I want to keep the original merge summary, but if the user
>  took the offer to rewrite the side branch (e.g. with a "reword"
>  to retitle), using the original merge message would probably
>  disappoint the user.

Right. But the user would then also freely admit that they asked for the
merge commit to be rebased, which is what `--rebase-merges` says.

> I think (1) actually is a feature.  Not everybody is an integrator
> who does not want to touch any commit on the topic branch(es) while
> rebuilding a single-strand-of-pearls that has many commits and an
> occasional merge of the tip of another topic branch.  It's just that
> the feature does not suit the workflow I use when I am playing the
> top-level integrator role.

As I said. The ideal thing would be to invest quite a bit in refactoring
especially the do_pick_commit() function, and then play out the rebase in
memory, where one state variable knows what the "HEAD" is (but the
worktree is left untouched, up until the point when an error occurs, in
which case we want to write out the files). This would also need a major
refactoring of the recursive merge, of course, which conflates the merge
part with the writing of the merge conflicts to disk part.

While I would love to see this happening, I don't think that I can spare
enough time to drive this, at least for a couple of years.

> I am not sure what should be the ideal behaviour for (2).  I would
> imagine that
> 
>  - I do want to keep the original title the merge (e.g. "into
>", if left to "git merge" to come up with the
>title during "rebase -i" session, would be lost and become "into
>HEAD", which is not what we want);
> 
>  - I do want to keep the original commentary 

Re: [PATCH v2 2/2] rebase -i: introduce the 'break' command

2018-10-12 Thread Johannes Schindelin
Hi Phillip,

On Fri, 12 Oct 2018, Phillip Wood wrote:

> Hi Johannes
> On 12/10/2018 09:35, Johannes Schindelin wrote:
> > Hi Phillip,
> > 
> > On Thu, 11 Oct 2018, Phillip Wood wrote:
> > 
> >> I think this would be a useful addition to rebase, there's one small
> >> comment below.
> >>
> >> On 10/10/2018 09:53, Johannes Schindelin via GitGitGadget wrote:
> >>> From: Johannes Schindelin 
> >>>
> >>> The 'edit' command can be used to cherry-pick a commit and then
> >>> immediately drop out of the interactive rebase, with exit code 0, to let
> >>> the user amend the commit, or test it, or look around.
> >>>
> >>> Sometimes this functionality would come in handy *without*
> >>> cherry-picking a commit, e.g. to interrupt the interactive rebase even
> >>> before cherry-picking a commit, or immediately after an 'exec' or a
> >>> 'merge'.
> >>>
> >>> This commit introduces that functionality, as the spanking new 'break'
> >>> command.
> >>>
> >>> Suggested-by: Stefan Beller 
> >>> Signed-off-by: Johannes Schindelin 
> >>> ---
> 
> >>> diff --git a/sequencer.c b/sequencer.c
> >>> index 8dd6db5a01..b209f8af46 100644
> >>> --- a/sequencer.c
> >>> +++ b/sequencer.c
> 
> >>> @@ -3293,6 +3295,9 @@ static int pick_commits(struct todo_list 
> >>> *todo_list, struct replay_opts *opts)
> >>>   unlink(rebase_path_stopped_sha());
> >>>   unlink(rebase_path_amend());
> >>>   delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
> >>> +
> >>> + if (item->command == TODO_BREAK)
> >>> + break;
> >>
> >> Normally when rebase stops it prints a message to say where it has got
> >> to and how to continue, it might be useful to do the same here
> > 
> > That's a very valid point. I'll think of something.
> 
> I'm not sure what gets left on the screen from the previous picks but
> something to say what the last pick/exec was and maybe what the current
> HEAD is would be useful I think.

I first wanted to do that, imitating the "Stopped at ... "
of an `edit` command, but it *is* now possible that we are on an unborn
branch... which will look a bit funny, I guess, as we construct a very
special place-holder commit to be HEAD in that case.

> One thing has just occurred to me - what does git status say (and does
> it still work - I'm not sure how much parsing it does on the todo and
> done files) if it is run while rebase is stopped on a break command?

I don't think it says anything special, really. It just reports that we're
in the middle of a rebase, listing up to two already-processed and up to
two to-be-processed todo commands.

Ciao,
Dscho


Re: [PATCH v2 2/2] rebase -i: introduce the 'break' command

2018-10-12 Thread Phillip Wood
Hi Johannes
On 12/10/2018 09:35, Johannes Schindelin wrote:
> Hi Phillip,
> 
> On Thu, 11 Oct 2018, Phillip Wood wrote:
> 
>> I think this would be a useful addition to rebase, there's one small
>> comment below.
>>
>> On 10/10/2018 09:53, Johannes Schindelin via GitGitGadget wrote:
>>> From: Johannes Schindelin 
>>>
>>> The 'edit' command can be used to cherry-pick a commit and then
>>> immediately drop out of the interactive rebase, with exit code 0, to let
>>> the user amend the commit, or test it, or look around.
>>>
>>> Sometimes this functionality would come in handy *without*
>>> cherry-picking a commit, e.g. to interrupt the interactive rebase even
>>> before cherry-picking a commit, or immediately after an 'exec' or a
>>> 'merge'.
>>>
>>> This commit introduces that functionality, as the spanking new 'break'
>>> command.
>>>
>>> Suggested-by: Stefan Beller 
>>> Signed-off-by: Johannes Schindelin 
>>> ---

>>> diff --git a/sequencer.c b/sequencer.c
>>> index 8dd6db5a01..b209f8af46 100644
>>> --- a/sequencer.c
>>> +++ b/sequencer.c

>>> @@ -3293,6 +3295,9 @@ static int pick_commits(struct todo_list *todo_list, 
>>> struct replay_opts *opts)
>>> unlink(rebase_path_stopped_sha());
>>> unlink(rebase_path_amend());
>>> delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
>>> +
>>> +   if (item->command == TODO_BREAK)
>>> +   break;
>>
>> Normally when rebase stops it prints a message to say where it has got
>> to and how to continue, it might be useful to do the same here
> 
> That's a very valid point. I'll think of something.

I'm not sure what gets left on the screen from the previous picks but
something to say what the last pick/exec was and maybe what the current
HEAD is would be useful I think. One thing has just occurred to me -
what does git status say (and does it still work - I'm not sure how much
parsing it does on the todo and done files) if it is run while rebase is
stopped on a break command?

Best Wishes

Phillip


Re: [PATCH 07/15] sequencer: make sequencer_make_script() write its script to a strbuf

2018-10-12 Thread SZEDER Gábor
On Sun, Oct 07, 2018 at 09:54:10PM +0200, Alban Gruin wrote:

> diff --git a/sequencer.c b/sequencer.c
> index 30a7fe3958..dfb8d1c974 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -4083,7 +4083,7 @@ static const char *label_oid(struct object_id *oid, 
> const char *label,
>  }
>  
>  static int make_script_with_merges(struct pretty_print_context *pp,
> -struct rev_info *revs, FILE *out,
> +struct rev_info *revs, struct strbuf *out,
>  unsigned flags)
>  {
>   int keep_empty = flags & TODO_LIST_KEEP_EMPTY;
> @@ -4230,7 +4230,7 @@ static int make_script_with_merges(struct 
> pretty_print_context *pp,
>* gathering commits not yet shown, reversing the list on the fly,
>* then outputting that list (labeling revisions as needed).
>*/
> - fprintf(out, "%s onto\n", cmd_label);
> + strbuf_addf(out, "%s onto\n", cmd_label);
>   for (iter = tips; iter; iter = iter->next) {
>   struct commit_list *list = NULL, *iter2;
>  
> @@ -4240,9 +4240,9 @@ static int make_script_with_merges(struct 
> pretty_print_context *pp,
>   entry = oidmap_get(, >object.oid);
>  
>   if (entry)
> - fprintf(out, "\n%c Branch %s\n", comment_line_char, 
> entry->string);
> + strbuf_addf(out, "\n%c Branch %s\n", comment_line_char, 
> entry->string);
>   else
> - fprintf(out, "\n");
> + strbuf_addf(out, "\n");

Please use plain strbuf_add() here.

Or strbuf_complete_line()?  Dunno, as seen in the previous hunk, 'out'
won't be empty at this point.



Re: What's cooking in git.git (Oct 2018, #01; Wed, 10)

2018-10-12 Thread Phillip Wood
On 11/10/2018 23:40, Junio C Hamano wrote:
> Phillip Wood  writes:
> 
>> On 10/10/2018 06:43, Junio C Hamano wrote:
>>> Here are the topics that have been cooking.  Commits prefixed with
>>> '-' are only in 'pu' (proposed updates) while commits prefixed with
>>> '+' are in 'next'.  The ones marked with '.' do not appear in any of
>>> the integration branches, but I am still holding onto them.
>>>
>>> * pw/diff-color-moved-ws-fix (2018-10-04) 5 commits
>>>   - diff --color-moved: fix a memory leak
>>>   - diff --color-moved-ws: fix another memory leak
>>>   - diff --color-moved-ws: fix a memory leak
>>>   - diff --color-moved-ws: fix out of bounds string access
>>>   - diff --color-moved-ws: fix double free crash
>>>
>>>   Various fixes to "diff --color-moved-ws".
>>>
>>>   What's the status of this topic?
>>
>> I think it is ready for next - Stefan was happy with the last iteration.
> 
> This is not about your fixes, but I was skimming the color-moved
> support in general as a final sanity check to move this forward and
> noticed that
> 
>   $ git diff --color-moved-ws=ignore-any master...
> 
> does not do anything interesting, which is broken at at least two
> points.
> 
>  * There is no "ignore-any" supported by the feature---I think that
>the parser for the option should have noticed and barfed, but it
>did not.  It merely emitted a message to the standard output and
>let it scroll away with the huge diff before the reader noticed
>it.

It would be nice if the parsing used starts_with(option_name, user_text)
rather than strcmp() as well. Also I think --color-moved=no is valid as
a synonym of --no-color-moved but --color-moved-ws=no is not supported.

>  * After fixing ignore-any to one of the supported option
>(e.g. "ignore-all-spaces"), the color-moved feature still did not
>trigger.  I think the presence of --color-moved-ws by itself is a
>hint that the user wants --color-moved to be used.  If it turns
>out that there are some valid use cases where --color-moved-ws
>may have to be set but the color-moved feature should not be
>enabled, then
> 
>   diff --color-moved-ws=ignore-all-space --no-color-moved
> 
>can be used to countermand this, of course.
> 
> Am I missing something or are these mere small sloppiness in the
> current code?
> 
> 
> 



Re: [PATCH 04/15] sequencer: refactor sequencer_add_exec_commands() to work on a todo_list

2018-10-12 Thread Phillip Wood
On 11/10/2018 17:57, Alban Gruin wrote:
> Hi Phillip,
> 
> thanks for taking the time to review my patches.
> 
> Le 11/10/2018 à 13:25, Phillip Wood a écrit :
>> On 07/10/2018 20:54, Alban Gruin wrote:
>>> @@ -4419,15 +4406,38 @@ int sequencer_add_exec_commands(const char
>>> *commands)
>>>   }
>>>     /* insert or append final  */
>>> -    if (insert >= 0 && insert < todo_list.nr)
>>> -    strbuf_insert(buf, todo_list.items[insert].offset_in_buf +
>>> +    if (insert >= 0 && insert < todo_list->nr)
>>> +    strbuf_insert(buf, todo_list->items[insert].offset_in_buf +
>>>     offset, commands, commands_len);
>>>   else if (insert >= 0 || !offset)
>>>   strbuf_add(buf, commands, commands_len);
>>>   -    i = write_message(buf->buf, buf->len, todo_file, 0);
>>> +    if (todo_list_parse_insn_buffer(buf->buf, todo_list))
>>> +    BUG("unusable todo list");}
>>
>> It is a shame to have to re-parse the todo list, I wonder how difficult
>> it would be to adjust the todo_list item array as the exec commands are
>> inserted. The same applies to the next couple of patches
>>
> 
> Good question.
> 
> This function inserts an `exec' command after every `pick' command.
> These commands are stored in a dynamically allocated list, grew with
> ALLOW_GROW().
> 
> If we want to keep the current structure, we would have to grow the size
> of the list by 1 and move several element to the end every time we want
> to add an `exec' command.  It would not be very effective.  Perhaps I
> should use a linked list here, instead.  It may also work well with
> rearrange_squash() and skip_unnecessary_picks().
> 
> Maybe we could even get rid of the strbuf at some point.

Another way would be to use the strbuf as a string pool rather than a
copy of the text of the file. There could be a write_todo_list()
function that takes a todo list and some flags, iterates over the items
in the todo list and writes the file. The flags would specify whether to
append the todo help and whether to abbreviate the object ids (so there
is no need for a separate call to transform_todos()). Then
add_exec_commands() could allocate a new array of todo items which it
builds up as it works through the original list and replaces the
original list with the new one at the end. The text of the exec items
can be added to the end of the strbuf (we only need one copy of the exec
text with this scheme). rearrange_squash() can use a temporary array to
build a new list as well or just memmove() things but that might be
slower if there is a lot of rearrangement to do. skip_unecessary_picks()
could just set the current item to the one we want to start with.

Best Wishes

Phillip

> 
>> Best Wishes
>>
>> Phillip
>>
> 
> Cheers,
> Alban
> 



Re: [PATCH] zlib.c: use size_t for size

2018-10-12 Thread Johannes Schindelin
Hi Junio,

On Fri, 12 Oct 2018, Junio C Hamano wrote:

> From: Martin Koegler 
> Date: Thu, 10 Aug 2017 20:13:08 +0200
> 
> Signed-off-by: Martin Koegler 
> Signed-off-by: Junio C Hamano 
> ---
> 
>  * I made minimal adjustments to make the change apply to today's
>codebase.  I still find some choices and mixing of off_t and
>size_t done by the patch a bit iffy, and some arith may need to
>become st_addX().  Extra set of eyes are very much appreciated.
> 
>  builtin/pack-objects.c | 10 +-
>  cache.h| 10 +-
>  pack-check.c   |  6 +++---
>  pack.h |  2 +-
>  packfile.h |  2 +-
>  wrapper.c  |  8 
>  zlib.c |  8 
>  7 files changed, 23 insertions(+), 23 deletions(-)
> 
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index e6316d294d..b9ca04eb8a 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -266,15 +266,15 @@ static void copy_pack_data(struct hashfile *f,
>   struct packed_git *p,
>   struct pack_window **w_curs,
>   off_t offset,
> - off_t len)
> + size_t len)
>  {
>   unsigned char *in;
> - unsigned long avail;
> + size_t avail;
>  
>   while (len) {
>   in = use_pack(p, w_curs, offset, );
>   if (avail > len)

Do we have to be careful to handle sizeof(off_t) != sizeof(size_t) here? I
guess we would receive a compiler warning about different sizes in that
case, but it makes me worry.

Not that it matters, as we currently have an *even worse* situation where
we *know* that `unsigned long` is 4 bytes on 64-bit Windows, and `off_t`
(the version we use, at least) is 8 bytes.

> - avail = (unsigned long)len;
> + avail = len;

Indentation change?

Apart from the consideration about different sizes of the data types
(which could maybe addressed via a comment in the commit message, after
thinking deeply about it), I think this patch is good to go.

Ciao,
Dscho

>   hashwrite(f, in, avail);
>   offset += avail;
>   len -= avail;
> @@ -1478,8 +1478,8 @@ static void check_object(struct object_entry *entry)
>   struct pack_window *w_curs = NULL;
>   const unsigned char *base_ref = NULL;
>   struct object_entry *base_entry;
> - unsigned long used, used_0;
> - unsigned long avail;
> + size_t used, used_0;
> + size_t avail;
>   off_t ofs;
>   unsigned char *buf, c;
>   enum object_type type;
> diff --git a/cache.h b/cache.h
> index d508f3d4f8..fce53fe620 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -20,10 +20,10 @@
>  #include 
>  typedef struct git_zstream {
>   z_stream z;
> - unsigned long avail_in;
> - unsigned long avail_out;
> - unsigned long total_in;
> - unsigned long total_out;
> + size_t avail_in;
> + size_t avail_out;
> + size_t total_in;
> + size_t total_out;
>   unsigned char *next_in;
>   unsigned char *next_out;
>  } git_zstream;
> @@ -40,7 +40,7 @@ void git_deflate_end(git_zstream *);
>  int git_deflate_abort(git_zstream *);
>  int git_deflate_end_gently(git_zstream *);
>  int git_deflate(git_zstream *, int flush);
> -unsigned long git_deflate_bound(git_zstream *, unsigned long);
> +size_t git_deflate_bound(git_zstream *, size_t);
>  
>  /* The length in bytes and in hex digits of an object name (SHA-1 value). */
>  #define GIT_SHA1_RAWSZ 20
> diff --git a/pack-check.c b/pack-check.c
> index fa5f0ff8fa..575e3e7125 100644
> --- a/pack-check.c
> +++ b/pack-check.c
> @@ -27,13 +27,13 @@ static int compare_entries(const void *e1, const void *e2)
>  }
>  
>  int check_pack_crc(struct packed_git *p, struct pack_window **w_curs,
> -off_t offset, off_t len, unsigned int nr)
> +off_t offset, size_t len, unsigned int nr)
>  {
>   const uint32_t *index_crc;
>   uint32_t data_crc = crc32(0, NULL, 0);
>  
>   do {
> - unsigned long avail;
> + size_t avail;
>   void *data = use_pack(p, w_curs, offset, );
>   if (avail > len)
>   avail = len;
> @@ -68,7 +68,7 @@ static int verify_packfile(struct packed_git *p,
>  
>   the_hash_algo->init_fn();
>   do {
> - unsigned long remaining;
> + size_t remaining;
>   unsigned char *in = use_pack(p, w_curs, offset, );
>   offset += remaining;
>   if (!pack_sig_ofs)
> diff --git a/pack.h b/pack.h
> index 34a9d458b4..1c9fecf929 100644
> --- a/pack.h
> +++ b/pack.h
> @@ -78,7 +78,7 @@ struct progress;
>  typedef int (*verify_fn)(const struct object_id *, enum object_type, 
> unsigned long, void*, int*);
>  
>  extern const char *write_idx_file(const char *index_name, struct 
> pack_idx_entry **objects, int nr_objects, 

Re: [PATCH] fixup! builtin rebase: support `--gpg-sign` option

2018-10-12 Thread Johannes Schindelin
Hi Junio,

On Thu, 11 Oct 2018, Junio C Hamano wrote:

> From: Johannes Schindelin 
> Date: Thu, 27 Sep 2018 14:48:17 +0200
> 
> The `--gpg-sign` option takes an *optional* argument, not a mandatory
> one.
> 
> This was discovered as part of the investigation of
> https://github.com/git-for-windows/git/issues/1836.
> 
> Signed-off-by: Johannes Schindelin 
> ---
> 
>  * I am sending this out as I want to mimize the number of
>non-trivial changes that come into my tree without hitting the
>list archive.

Thanks, *much* appreciated.

Ciao,
Dscho

> 
>  builtin/rebase.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index a28bfbd62f..43bc6f7915 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -1030,8 +1030,9 @@ int cmd_rebase(int argc, const char **argv, const char 
> *prefix)
>   OPT_BOOL(0, "autosquash", ,
>N_("move commits that begin with "
>   "squash!/fixup! under -i")),
> - OPT_STRING('S', "gpg-sign", _sign,
> -N_("gpg-sign?"), N_("GPG-sign commits")),
> + { OPTION_STRING, 'S', "gpg-sign", _sign, N_("key-id"),
> + N_("GPG-sign commits"),
> + PARSE_OPT_OPTARG, NULL, (intptr_t) "" },
>   OPT_STRING_LIST(0, "whitespace", ,
>   N_("whitespace"), N_("passed to 'git apply'")),
>   OPT_SET_INT('C', 0, _c, N_("passed to 'git apply'"),
> -- 
> 2.19.1-328-g5a0cc8aca7
> 
> 


Re: [PATCH] diff.c: pass sign_index to emit_line_ws_markup

2018-10-12 Thread Johannes Schindelin
Hi Stefan,

On Wed, 10 Oct 2018, Stefan Beller wrote:

> Instead of passing the sign directly to emit_line_ws_markup, pass only the
> index to lookup the sign in diff_options->output_indicators.
> 
> Signed-off-by: Stefan Beller 
> ---
> 
> I still have this patch laying around, it simplifies the diff code
> a tiny bit.

And I still like it (obviously, was my idea :-))

Thanks,
Dscho

> 
> Stefan
> 
>  diff.c | 12 +---
>  1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/diff.c b/diff.c
> index f0c7557b40..9e895f2191 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -1202,10 +1202,11 @@ static void dim_moved_lines(struct diff_options *o)
>  static void emit_line_ws_markup(struct diff_options *o,
>   const char *set_sign, const char *set,
>   const char *reset,
> - char sign, const char *line, int len,
> + int sign_index, const char *line, int len,
>   unsigned ws_rule, int blank_at_eof)
>  {
>   const char *ws = NULL;
> + int sign = o->output_indicators[sign_index];
>  
>   if (o->ws_error_highlight & ws_rule) {
>   ws = diff_get_color_opt(o, DIFF_WHITESPACE);
> @@ -1285,8 +1286,7 @@ static void emit_diff_symbol_from_struct(struct 
> diff_options *o,
>   set = diff_get_color_opt(o, DIFF_FILE_OLD);
>   }
>   emit_line_ws_markup(o, set_sign, set, reset,
> - 
> o->output_indicators[OUTPUT_INDICATOR_CONTEXT],
> - line, len,
> + OUTPUT_INDICATOR_CONTEXT, line, len,
>   flags & (DIFF_SYMBOL_CONTENT_WS_MASK), 0);
>   break;
>   case DIFF_SYMBOL_PLUS:
> @@ -1330,8 +1330,7 @@ static void emit_diff_symbol_from_struct(struct 
> diff_options *o,
>   flags &= ~DIFF_SYMBOL_CONTENT_WS_MASK;
>   }
>   emit_line_ws_markup(o, set_sign, set, reset,
> - o->output_indicators[OUTPUT_INDICATOR_NEW],
> - line, len,
> + OUTPUT_INDICATOR_NEW, line, len,
>   flags & DIFF_SYMBOL_CONTENT_WS_MASK,
>   flags & DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF);
>   break;
> @@ -1375,8 +1374,7 @@ static void emit_diff_symbol_from_struct(struct 
> diff_options *o,
>   set = diff_get_color_opt(o, DIFF_CONTEXT_DIM);
>   }
>   emit_line_ws_markup(o, set_sign, set, reset,
> - o->output_indicators[OUTPUT_INDICATOR_OLD],
> - line, len,
> + OUTPUT_INDICATOR_OLD, line, len,
>   flags & DIFF_SYMBOL_CONTENT_WS_MASK, 0);
>   break;
>   case DIFF_SYMBOL_WORDS_PORCELAIN:
> -- 
> 2.19.0
> 
> 


Re: [PATCH v2] range-diff: allow to diff files regardless submodule

2018-10-12 Thread Johannes Schindelin
Hi Lucas,

On Thu, 11 Oct 2018, Lucas De Marchi wrote:

> Do like it's done in grep so mode doesn't end up as
> 016, which means range-diff doesn't work if one has
> "submodule.diff = log" in the configuration. Without this
> while using range-diff I only get a
> 
> Submodule a 000...000 (new submodule)
> 
> instead of the diff between the revisions.
> 
> Signed-off-by: Lucas De Marchi 

Thank you for this contribution, which I am glad to ACK.

I am especially happy that you added a regression test so that we are
confident not to break this again.

Ciao,
Dscho

> ---
>  range-diff.c  |  2 +-
>  t/t3206-range-diff.sh | 29 +
>  2 files changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/range-diff.c b/range-diff.c
> index 60edb2f518..bd8083f2d1 100644
> --- a/range-diff.c
> +++ b/range-diff.c
> @@ -354,7 +354,7 @@ static struct diff_filespec *get_filespec(const char 
> *name, const char *p)
>  {
>   struct diff_filespec *spec = alloc_filespec(name);
>  
> - fill_filespec(spec, _oid, 0, 0644);
> + fill_filespec(spec, _oid, 0, 0100644);
>   spec->data = (char *)p;
>   spec->size = strlen(p);
>   spec->should_munmap = 0;
> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> index 045aca1c18..6aae364171 100755
> --- a/t/t3206-range-diff.sh
> +++ b/t/t3206-range-diff.sh
> @@ -122,6 +122,35 @@ test_expect_success 'changed commit' '
>   test_cmp expected actual
>  '
>  
> +test_expect_success 'changed commit with sm config' '
> + git range-diff --no-color --submodule=log topic...changed >actual &&
> + cat >expected <<-EOF &&
> + 1:  4de457d = 1:  a4b s/5/A/
> + 2:  fccce22 = 2:  f51d370 s/4/A/
> + 3:  147e64e ! 3:  0559556 s/11/B/
> + @@ -10,7 +10,7 @@
> +   9
> +   10
> +  -11
> + -+B
> + ++BB
> +   12
> +   13
> +   14
> + 4:  a63e992 ! 4:  d966c5c s/12/B/
> + @@ -8,7 +8,7 @@
> +  @@
> +   9
> +   10
> + - B
> + + BB
> +  -12
> +  +B
> +   13
> + EOF
> + test_cmp expected actual
> +'
> +
>  test_expect_success 'no commits on one side' '
>   git commit --amend -m "new message" &&
>   git range-diff master HEAD@{1} HEAD
> -- 
> 2.19.1.1.g8c3cf03f71
> 
> 


Re: [PATCH] range-diff: allow to diff files regardless submodule

2018-10-12 Thread Johannes Schindelin



On Thu, 11 Oct 2018, Lucas De Marchi wrote:

> On Wed, Oct 10, 2018 at 5:02 PM brian m. carlson
>  wrote:
> >
> > On Wed, Oct 10, 2018 at 08:09:16AM -0700, Lucas De Marchi wrote:
> > > Do like it's done in grep so mode doesn't end up as
> > > 016, which means range-diff doesn't work if one has
> > > "submodule.diff = log" in the configuration. Without this
> > > while using range-diff I only get a
> > >
> > > Submodule a 000...000 (new submodule)
> > >
> > > instead of the diff between the revisions.
> > >
> > > Signed-off-by: Lucas De Marchi 
> > > ---
> > >  range-diff.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/range-diff.c b/range-diff.c
> > > index 60edb2f518..bd8083f2d1 100644
> > > --- a/range-diff.c
> > > +++ b/range-diff.c
> > > @@ -354,7 +354,7 @@ static struct diff_filespec *get_filespec(const char 
> > > *name, const char *p)
> > >  {
> > >   struct diff_filespec *spec = alloc_filespec(name);
> > >
> > > - fill_filespec(spec, _oid, 0, 0644);
> > > + fill_filespec(spec, _oid, 0, 0100644);
> >
> > If we have a system that has different mode values from the common Unix
> > ones, is this still correct or does it need to change?
> 
> From what I can see this would still be correct, or at least git-grep
> implementation would be broken.

As you can see from the Windows port: we are stuck with the simplistic
POSIX permissions in Git, and platforms that have a different permission
system have to emulate it.

We only use preciously few bit masks, anyway. For example, we do not
really use the lower 12 bits for anything but "executable or not?"

So Lucas' patch is correct, AFAICT.

Ciao,
Dscho

> 
> Lucas De Marchi
> > --
> > brian m. carlson: Houston, Texas, US
> > OpenPGP: https://keybase.io/bk2204
> 
> 
> 
> -- 
> Lucas De Marchi
> 


Re: `--rebase-merges' still failing badly

2018-10-12 Thread Johannes Schindelin
Hi Michael,

On Wed, 10 Oct 2018, Michael Witten wrote:

> In my opinion, the `--rebase-merges' feature has been broken since the
> beginning, and the builtin version should  be fixed before it is moved
> ahead.

Everybody is entitled to an opinion. My opinion differs from yours, and I
am a heavy user of `git rebase -kir`.

The `--rebase-merges` feature is not without problems, of course. I can
name a couple of bugs, but I have a hunch that it is more efficient for me
to just fix them.

> In short: "labels" are brittle; see below for tests.

Sure, let's improve them.

> Also, here are some quick *additional* thoughts:
> 
> * Labels should be simply "r0", "r1", ... "rN".

That would not be an improvement.

The *interactive* version of `--rebase-merges` is what I use extensively
to juggle Git for Windows' branch thicket. It would be really bad if I had
to somehow map those label names in my head, rather than having the
intuitively-understood labels.

I would understand if you suggested to try to come up with a better naming
than `branch-point-`. But `r`? That's worse than the current state.
By a lot.

> * Why is the command "label" and not "branch"?

Because it is more versatile than just a branch. It is also branch points.
As a matter of fact, the very first statement is about the `onto` label,
which is not a branch.

> * In my experience, there's a lot of this boiler plate:
> 
>   pick 12345
>   label r1
>   reset r0
>   merge r1
> 
>   How about instead, use git's existing ideas:
> 
>   pick 12345
>   reset r0
>   merge ORIG_HEAD

Too magic. And you cannot change it easily. I had this very real example,
a couple of times yesterday: A merge was in one of the "branches", and
needed to be moved out of it:

pick abc
label branch-point
merge -C 0123 topic
pick def
label bug-fix

reset branch-point
merge -C 4567 bug-fix

This `merge -C 0123 topic` needed to be moved before the branch point.

Another example where the explicit labeling comes in *real* handy is when
I made a Pull Request in Git for Windows ready for contribution to core
Git. These Pull Requests are normally based on `master`, because that is
what the best PR flow is: you based your contributions as close to the tip
as possible, to avoid merge conflicts (and to test as close to the real,
after-merge thing). This would look like this:

label branch-point
pick 123
pick 456
label pr-0815

reset branch-point
merge -C abc pr-0815

Now, to prepare this for core Git, I have to graft this PR onto the
`master` of *upstream*, in our case I would use the `onto` label for that,
by inserting a `reset onto` just before `pick 123`.

So you see, the current, non-implicit, but very much explicit syntax,
makes all of these tasks *quite* easy, and more importantly,
straight-forward: I did not have to explain this to anyone who I needed to
teach how this works.

Remember: the syntax of the todo list is not optimized to be short. It is
optimized to be *editable*. I need to have a very easy way to juggle
criss-cross-merging branch thickets. And the current syntax, while
chattier than you would like, does the job. Pretty well, even.

> * Why not just `--merges' instead of `--rebase-merges'?

This ship has sailed. It is pointless to discuss this now.

Besides, I believe that in your quest to shorten things, you unfortunately
shortened things too much: it is no longer clear what "merges" means in
the context of `--merges`.

> Unfortunately,   both  the   legacy   version  and   the  rewrite   of
> `--rebase-merges'  display  a  bug  that  makes  this  feature  fairly
> unusable in  practice;

You will be surprised just how much I would embrace bug fixes, once you
provide any.

> it tries  to create  a "label" (i.e.,  a branch name) from a commit log
> summary  line, and the result is often invalid (or just  plain
> irritating to work  with). In particular, it  fails on typical
> characters, including at least these:
> 
> :/\?.*[]

And of course those are not the only ones. The trick is to reduce runs of
disallowed characters to dashes, as is already done with spaces.

Ciao,
Johannes

> 
> To see this, first define some POSIX shell functions:
> 
> test()
> {
> (
> set -e
> summary=$1
> d=/tmp/repo # WARNING. CHANGE IF NECESSARY.
> rm -rf "$d"; mkdir -p "$d"; cd "$d"
> git init -q
> echo a > a; git add a; git commit -q -m a
> git branch base
> echo b > b; git add b; git commit -q -m b
> git reset -q --hard HEAD^
> git merge -q --no-ff -m "$summary" ORIG_HEAD
> git log --graph --oneline
> git rebase --rebase-merges base
> ); status=$?
> echo
> return "$status"
> }
> 
> Test()
> {
> if test 

Re: [PATCH] [Outreachy] git/userdiff.c fix regex pattern error

2018-10-12 Thread Junio C Hamano
Johannes Schindelin  writes:

> On Sun, 7 Oct 2018, Junio C Hamano wrote:
>
>> And then there is an unnamed misdesigned language that has a
>> regmatch() function, which takes a string that contains a regular
>> expression, but somehow requires that string to begin and end with a
>> slash for no justifiable reason ;-).
>
> It makes more sense once you realize that // is a very nice
> syntactic construct to obtain an instance of a regular expression object,
> rather than a string describing one.

In Perl, qr// is often how you use that syntactic construct.

The unnamed misdesigned language literally uses a string whose
content is //; iow, 

preg_match('/a.*b/', ...)

is what you see, and there is no "regex object involved.  As far as
the function is concerned, it is receiving a string, that represents
a regexp.  There is no reason to require the pattern _string_ to
begin and end with slash, but somehow they do.

> But in C, we do not have objects, so the way to describe a regular
> expression is a string (which you have to compile into an opaque data
> structure before applying it).

Absolutely.  C does not have regexp object literal and the literal
you write to represent a regexp is just a string that you would pass
to functions like regcomp().  But at least C is not so brain-dead to
require slashes on either end of that string ;-)


Re: [PATCH v2 2/2] rebase -i: introduce the 'break' command

2018-10-12 Thread Johannes Schindelin
Hi Phillip,

On Thu, 11 Oct 2018, Phillip Wood wrote:

> I think this would be a useful addition to rebase, there's one small
> comment below.
> 
> On 10/10/2018 09:53, Johannes Schindelin via GitGitGadget wrote:
> > From: Johannes Schindelin 
> > 
> > The 'edit' command can be used to cherry-pick a commit and then
> > immediately drop out of the interactive rebase, with exit code 0, to let
> > the user amend the commit, or test it, or look around.
> > 
> > Sometimes this functionality would come in handy *without*
> > cherry-picking a commit, e.g. to interrupt the interactive rebase even
> > before cherry-picking a commit, or immediately after an 'exec' or a
> > 'merge'.
> > 
> > This commit introduces that functionality, as the spanking new 'break'
> > command.
> > 
> > Suggested-by: Stefan Beller 
> > Signed-off-by: Johannes Schindelin 
> > ---
> >  Documentation/git-rebase.txt | 3 +++
> >  rebase-interactive.c | 1 +
> >  sequencer.c  | 7 ++-
> >  t/lib-rebase.sh  | 2 +-
> >  t/t3418-rebase-continue.sh   | 9 +
> >  5 files changed, 20 insertions(+), 2 deletions(-)
> > 
> > diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
> > index db2faca73c..5bed1da36b 100644
> > --- a/Documentation/git-rebase.txt
> > +++ b/Documentation/git-rebase.txt
> > @@ -561,6 +561,9 @@ By replacing the command "pick" with the command 
> > "edit", you can tell
> >  the files and/or the commit message, amend the commit, and continue
> >  rebasing.
> >  
> > +To interrupt the rebase (just like an "edit" command would do, but without
> > +cherry-picking any commit first), use the "break" command.
> > +
> >  If you just want to edit the commit message for a commit, replace the
> >  command "pick" with the command "reword".
> >  
> > diff --git a/rebase-interactive.c b/rebase-interactive.c
> > index 0f4119cbae..78f3263fc1 100644
> > --- a/rebase-interactive.c
> > +++ b/rebase-interactive.c
> > @@ -14,6 +14,7 @@ void append_todo_help(unsigned edit_todo, unsigned 
> > keep_empty,
> >  "s, squash  = use commit, but meld into previous commit\n"
> >  "f, fixup  = like \"squash\", but discard this commit's log 
> > message\n"
> >  "x, exec  = run command (the rest of the line) using shell\n"
> > +"b, break = stop here (continue rebase later with 'git rebase 
> > --continue')\n"
> >  "d, drop  = remove commit\n"
> >  "l, label  = label current HEAD with a name\n"
> >  "t, reset  = reset HEAD to a label\n"
> > diff --git a/sequencer.c b/sequencer.c
> > index 8dd6db5a01..b209f8af46 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -1416,6 +1416,7 @@ enum todo_command {
> > TODO_SQUASH,
> > /* commands that do something else than handling a single commit */
> > TODO_EXEC,
> > +   TODO_BREAK,
> > TODO_LABEL,
> > TODO_RESET,
> > TODO_MERGE,
> > @@ -1437,6 +1438,7 @@ static struct {
> > { 'f', "fixup" },
> > { 's', "squash" },
> > { 'x', "exec" },
> > +   { 'b', "break" },
> > { 'l', "label" },
> > { 't', "reset" },
> > { 'm', "merge" },
> > @@ -1964,7 +1966,7 @@ static int parse_insn_line(struct todo_item *item, 
> > const char *bol, char *eol)
> > padding = strspn(bol, " \t");
> > bol += padding;
> >  
> > -   if (item->command == TODO_NOOP) {
> > +   if (item->command == TODO_NOOP || item->command == TODO_BREAK) {
> > if (bol != eol)
> > return error(_("%s does not accept arguments: '%s'"),
> >  command_to_string(item->command), bol);
> > @@ -3293,6 +3295,9 @@ static int pick_commits(struct todo_list *todo_list, 
> > struct replay_opts *opts)
> > unlink(rebase_path_stopped_sha());
> > unlink(rebase_path_amend());
> > delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
> > +
> > +   if (item->command == TODO_BREAK)
> > +   break;
> 
> Normally when rebase stops it prints a message to say where it has got
> to and how to continue, it might be useful to do the same here

That's a very valid point. I'll think of something.

Ciao,
Dscho

> 
> Best Wishes
> 
> Phillip
> 
> > }
> > if (item->command <= TODO_SQUASH) {
> > if (is_rebase_i(opts))
> > diff --git a/t/lib-rebase.sh b/t/lib-rebase.sh
> > index 25a77ee5cb..584604ee63 100644
> > --- a/t/lib-rebase.sh
> > +++ b/t/lib-rebase.sh
> > @@ -49,7 +49,7 @@ set_fake_editor () {
> > case $line in
> > squash|fixup|edit|reword|drop)
> > action="$line";;
> > -   exec*)
> > +   exec*|break)
> > echo "$line" | sed 's/_/ /g' >> "$1";;
> > "#")
> > echo '# comment' >> "$1";;
> > diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh
> > index c145dbac38..185a491089 100755
> > --- a/t/t3418-rebase-continue.sh
> > +++ b/t/t3418-rebase-continue.sh
> > 

Re: [PATCH v2 1/2] rebase -i: clarify what happens on a failed `exec`

2018-10-12 Thread Johannes Schindelin
Hi Junio & Eric,

On Thu, 11 Oct 2018, Junio C Hamano wrote:

> Eric Sunshine  writes:
> 
> > On Wed, Oct 10, 2018 at 4:54 AM Johannes Schindelin via GitGitGadget
> >  wrote:
> >> We had not documented previously what happens when an `exec` command in
> >> an interactive rebase fails. Now we do.
> >>
> >> Signed-off-by: Johannes Schindelin 
> >> ---
> >> diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
> >> @@ -420,7 +420,8 @@ idea unless you know what you are doing (see BUGS 
> >> below).
> >>  --exec ::
> >> Append "exec " after each line creating a commit in the
> >> final history.  will be interpreted as one or more shell
> >> -   commands.
> >> +   commands. Anz command that fails will interrupt the rebase,
> >> +   withe exit code 1.
> >
> > s/Anz/Any/
> > s/withe/with/

These tyopes will be fxied in the nxet itaretion.

Ciao,
Dhsco

> 
> Heh, I know I am not good at spelling, either, but hopefully I am a
> bit more careful than leaving as many typoes as I have added lines
> in my patch X-<.  After all, it's not a race to send in patches as
> quickly as possible.
> 
> Queued.  Thanks, both.
> 


Re: Diff Range Underflow

2018-10-12 Thread Johannes Schindelin
Hi Era,

On Sun, 7 Oct 2018, Era wrote:

> I discovered an apparent underflow when using the —unified=0 / -U0 flag with 
> git-show on a merge commit.
> Leaving the flag on its default value or 1 shows the ranges correctly.
> 
>   $git --no-pager show -z --no-color --pretty=%x00%s%x00%b%x00 -U1 
> 3ac9cfed9ce01836dea1fee96c4eceb2df4b6878
> 
> produces a diff with the following ranges
> 
>   @@@ -582,2 -599,19 +582,1 @@@ extension TranslationContentViewControl
> 
> changing the flag to -U0 like so
> 
>   git --no-pager show -z --no-color --pretty=%x00%s%x00%b%x00 -U0 
> 3ac9cfed9ce01836dea1fee96c4eceb2df4b6878
> 
> results in the following ranges begin output
> 
>   @@@ -583,0 -600,17 +583,18446744073709551615 @@@ extension 
> TranslationContentViewControl
> 
> 
> Obviously this is some sort of underflow bug.
> Unfortunately I cant share the original repo or diff.

Happily, you can work on reproducing this with a public repository, or
even better: with two crafted files that can be compared via `git diff
--no-index ...` to trigger the bug, starting with the blob contents you
cannot share and editing them down to the minimal size, changing the
contents to no longer contain any sensitive information. Please do so.

Ciao,
Johannes

> 
> 
> Best regards,
> Erik
> 
> 
> 
> 
> 
> 
> 

Re: [PATCH 1/1] subtree: add build targets 'man' and 'html'

2018-10-12 Thread Johannes Schindelin


On Wed, 10 Oct 2018, Junio C Hamano wrote:

> Christian Hesse  writes:
> 
> > From: Christian Hesse 
> >
> > We have targets 'install-man' and 'install-html', let's add build
> > targets as well.
> >   ...
> > +man: $(GIT_SUBTREE_DOC)
> > +
> > +html: $(GIT_SUBTREE_HTML)
> > +
> 
> As 'contrib' material without real maintenance, I do not care too
> deeply, but shouldn't this change be more like this to avoid
> duplicating the list of targets?

Ævar mentioned that he'd like this to graduate from contrib into core, and
I had briefly looked at making subtree a built-in to that end. IOW please
do not get too used to subtree being in contrib.

Ciao,
Dscho

> 
> 
> diff --git a/contrib/subtree/Makefile b/contrib/subtree/Makefile
> index 5c6cc4ab2c..4a10a020a0 100644
> --- a/contrib/subtree/Makefile
> +++ b/contrib/subtree/Makefile
> @@ -59,17 +59,21 @@ $(GIT_SUBTREE): $(GIT_SUBTREE_SH)
>  
>  doc: $(GIT_SUBTREE_DOC) $(GIT_SUBTREE_HTML)
>  
> +man: $(GIT_SUBTREE_DOC)
> +
> +html: $(GIT_SUBTREE_HTML)
> +
>  install: $(GIT_SUBTREE)
>   $(INSTALL) -d -m 755 $(DESTDIR)$(gitexecdir)
>   $(INSTALL) -m 755 $(GIT_SUBTREE) $(DESTDIR)$(gitexecdir)
>  
>  install-doc: install-man install-html
>  
> -install-man: $(GIT_SUBTREE_DOC)
> +install-man: man
>   $(INSTALL) -d -m 755 $(DESTDIR)$(man1dir)
>   $(INSTALL) -m 644 $^ $(DESTDIR)$(man1dir)
>  
> -install-html: $(GIT_SUBTREE_HTML)
> +install-html: html
>   $(INSTALL) -d -m 755 $(DESTDIR)$(htmldir)
>   $(INSTALL) -m 644 $^ $(DESTDIR)$(htmldir)
>  
> @@ -94,4 +98,4 @@ clean:
>   $(RM) $(GIT_SUBTREE)
>   $(RM) *.xml *.html *.1
>  
> -.PHONY: FORCE
> +.PHONY: FORCE man html install-man install-html
> 
> 

  1   2   >