Re: [PATCH 06/12] log: add -P as a synonym for --perl-regexp

2017-04-09 Thread Junio C Hamano
On Sat, Apr 8, 2017 at 10:25 PM, Ævar Arnfjörð Bjarmason wrote:
> Add a short -P option as a synonym for the longer --perl-regexp, for
> consistency with the options the corresponding grep invocations
> accept.
>
> This was intentionally omitted in commit 727b6fc3ed ("log --grep:
> accept --basic-regexp and --perl-regexp", 2012-10-03) for unspecified
> future use.
>
> Since nothing has come along in over 4 1/2 years that's wanted to use
> it, it's more valuable to make it consistent with "grep" than to keep
> it open for future use, and to avoid the confusion of -P meaning
> different things for grep & log, as is the case with the -G option.

I initially had a strong reaction to the above "4 1/2 years entitles
us to do anything that might inconvenience future developers"
reasoning, but as long as we already have -E usable for extended
regexp, even if we will never be able to use -G for basic regexp, I am
OK with giving -P to pcre to be consistent with "git grep". I'd even
be supportive if others agree with this change.

Thanks.


Re: [PATCH 2/2] test-lib: exhaustively insert non-alnum ASCII into the TRASH_DIRECTORY name

2017-04-09 Thread SZEDER Gábor
> Change the test library to insert non-alphanumeric ASCII characters
> into the TRASH_DIRECTORY name, that's the directory the test library
> creates, chdirs to and runs each individual test from.
> 
> Unless test_fails_on_unusual_directory_names=1 is declared before
> importing test-lib.sh (and if perl isn't available on the system), the
> trash directory will contain every non-alphanumeric character in
> ASCII, in order.

At the very least there must be an easier way to disable this, e.g. a
command line option.

This change sure is effective at smoking out bugs, but it's a major
annoyance during development when it comes to debugging a test.  At
first I could not even cd into the trash directory, because TAB
completing the directory name with all those non-printable characters
didn't work (this may be a bug in the bash-completion package).  And
simply copy-pasting the dirname didn't work either, because 'ls'
displayed it as

  trash directory.t9902-completion.??? 
!"#$%&'()*+,-:;<=>?@[\]^_`{|}~?

After some headscratching (Sunday night may be my excuse) I figured
out that 'cd tr*' works...  only to be greeted with the ugliest-ever
three-line(!) shell prompt.

Therefore I would say that this should not even be enabled by default
in test-lib.sh, so that at least running a test directly from the
command line as ./t1234-foo.sh would give us an easily accessible
trash directory even without any command line options.  We could
enable it for 'make test' by default via GIT_TEST_OPTS in t/Makefile,
though.
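For what it's worth, the guard the patch adds can be previewed in
isolation; here is a standalone sketch of the same check (with the
knob cleared to simulate the default):

```shell
# Same logic as the patch's test-lib.sh hunk: only mangle the trash
# directory name when the knob is unset and perl actually works.
test_fails_on_unusual_directory_names=
if test -z "$test_fails_on_unusual_directory_names" &&
   test "$(perl -e 'print 1+1' 2>/dev/null)" = "2"
then
	echo "unusual trash directory name"
else
	echo "plain trash directory name"
fi
```

(The patch writes the two conditions with the obsolescent `test ... -a ...`;
the sketch uses `&&` instead, which behaves the same here.)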


> This includes all the control characters, !, [], {} etc. the "."
> character isn't included because it's already in the directory name,
> and nor is "/" for obvious reasons, although that would actually work,
> we'd just create a subdirectory, which would make the tests harder to
> inspect when they fail.i

1. Heh.  How an additional subdirectory would make the tests harder to
   inspect is nothing compared to the effect of all the other
   characters.

2. s/i$//

> This change is inspired by the "submodule: prevent backslash expantion
> in submodule names" patch[1]. If we'd had backslashes in the
> TRASH_DIRECTORY all along that bug would have been fixed a long time
> ago. This will flag such issues by marking tests that currently fail
> with "test_fails_on_unusual_directory_names=1", ensure that new tests
> aren't added unless a discussion is had about why the code can't
> handle unusual pathnames, and prevent future regressions.
> 
> 1. <20170407172306.172673-1-bmw...@google.com>
> ---
>  t/README  | 12 
>  t/test-lib.sh |  4 
>  2 files changed, 16 insertions(+)
> 
> diff --git a/t/README b/t/README
> index ab386c3681..314dd40221 100644
> --- a/t/README
> +++ b/t/README
> @@ -345,6 +345,18 @@ assignment to variable 'test_description', like this:
>   This test registers the following structure in the cache
>   and tries to run git-ls-files with option --frotz.'
>  
> +By default the tests will be run from a directory with a highly
> +unusual filename that includes control characters, a newline, various
> +punctuation etc., this is done to smoke out any bugs related to path
> +handling. If for whatever reason the tests can't deal with such
> +unusual path names, set:
> +
> +test_fails_on_unusual_directory_names=1
> +
> +Before sourcing 'test-lib.sh' as described below. This option is
> +mainly intended to grandfather in existing broken tests & code, and
> +should usually not be used in new code, instead your tests or code
> +probably need fixing.
> 
>  Source 'test-lib.sh'
>  
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 13b5696822..089ff5ac7d 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -914,6 +914,10 @@ fi
>  
>  # Test repository
>  TRASH_DIRECTORY="trash directory.$(basename "$0" .sh)"
> +if test -z "$test_fails_on_unusual_directory_names" -a "$(perl -e 'print 1+1' 2>/dev/null)" = "2"
> +then
> +   TRASH_DIRECTORY="$TRASH_DIRECTORY.$(perl -e 'print join q[], grep { /[^[:alnum:]]/ and !m<[./]> } map chr, 0x01..0x7f')"
> +fi
>  test -n "$root" && TRASH_DIRECTORY="$root/$TRASH_DIRECTORY"
>  case "$TRASH_DIRECTORY" in
>  /*) ;; # absolute path is good
> -- 
> 2.11.0




Re: [PATCH] Make git log work for git CWD outside of work tree

2017-04-09 Thread Junio C Hamano
Danny Sauer  writes:

> That's feedback I was looking for - what's the "right" way to get the
> path to the file.

My knee-jerk reaction is that you SHOULDN'T have to do anything
special to "get the path".

I do not have the code in front of me (this is my "down" week after
all), so let me give a general outline of how to reason about this,
with some knowledge of how the system's code has evolved.

In the beginning, Git worked only from the top level of the working
tree.  If you have your project in ~/myproject with a file "f" in a
directory "d", that file is "~/myproject/d/f", and you would do

$ cd ~/myproject
$ git add d/f

to add it.  If Git wanted to access the index, it just accessed
".git/index" as a relative path.  If Git wanted to access a file at
the top of the working tree, say ".gitignore", it just accessed
".gitignore" as a relative path.  Because it only worked from the
top.

Then we added support for running Git from a subdirectory, so that
you can say "cd ~/myproject/d && git add f".  In order to keep the
existing code that wants to access the index as ".git/index" and
wants to access the top-level ".gitignore" as ".gitignore" working,
the support to run Git from a subdirectory is designed this way:

- Each main program of subcommand (e.g. cmd_log()) receives a
  parameter "prefix", which tells what subdirectory you were in
  when you started Git.

- Before running the main program of subcommand, Git chdir(2)s
  up to the top level of the working tree.

- The main program of subcommand receives the command line from
  the user intact.  It is responsible for prepending the prefix
  to the path the user gave it from the command line.

So if you did "cd ~/myproject/d && git add f", Git goes up to
"~/myproject", passes argv=("f", NULL) and prefix="d/" to cmd_add().

Adding to the index wants to read and update the index, which is
still done by opening ".git/index" (relative to the toplevel as
before), and inspecting the top-level .gitignore file is done by
opening ".gitignore" (relative to the toplevel as before).  And
the cmd_add() forms the path "d/f" by using the prefix "d/" and the
user-supplied pathname "f".

When we first added support for having the (equivalent of) ".git/"
directory outside the working tree by setting GIT_DIR environment
variable, again, you can only use Git from the top-level of the
working tree.  Instead of "~/myproject/.git", you can keep your
repository metainformation in say "~/mypro.git/" and point at it
with GIT_DIR environment variable, and said

$ export GIT_DIR=~/mypro.git
$ cd ~/myproject
$ git add d/f

Later we also added GIT_WORK_TREE environment variable to be used
together with GIT_DIR so that you can start from ~/myproject/d, very
similarly to how you worked from a subdirectory without these
environment variables, i.e.

$ export GIT_DIR=~/mypro.git GIT_WORK_TREE=~/myproject
$ cd ~/myproject/d
$ git add f

The way this support was added was to go to the top-level of the
working tree (i.e. "~/myproject" in this example) and passing the
prefix (i.e. "d" in this example).

Notice that in all of the above configurations, if a Git command
knows a path to something that is relative to the top of the working
tree (e.g. ".git/index" in the ancient Git, ".gitignore" at the
top-level that governs the entire working tree, or ".mailmap"), it
should just be able to open that path without asking "where is the
top of the working tree?".

So if your directory arrangement is a variation of these basic
configurations supported, e.g. if your git directory is
~/myproject/.git and your working tree is ~/myproject, and you use
the GIT_DIR and GIT_WORK_TREE to point at them, regardless of which
subdirectory of $GIT_WORK_TREE you started with, Git should already
have done chdir(2) to ~/myproject/ before it starts cmd_log(), and
it should be able to just open and read ".mailmap" (of course, it
needs to limit itself to do so only when it is working with a
repository with a working tree).

If your arrangement is even more exotic, e.g. you have these two
variables set, and then are running from OUTSIDE the working tree,
my knee-jerk reaction is that you should get your head examined, as
it is totally unclear what "git add foo" would mean if you did this:

$ export GIT_DIR=~/myproject/.git GIT_WORK_TREE=~/myproject
$ cd ~/myproject/../somewhere/else
$ git add foo

But it should still "work" in the sense that the above command
should notice that you are trying to add "../somewhere/else/foo" to
the index, which is obviously nonsense, and die with a message.

Similarly, if you replace "git add foo" with "git log", it still
should work in the above, i.e.

$ export GIT_DIR=~/myproject/.git GIT_WORK_TREE=~/myproject
$ cd ~/myproject/../somewhere/else
$ git log

If Git is not chdir(2)ing to ~/myproject before calling cmd_log()
in the above (again, this is my down week so I didn't and 

Re: [PATCH 0/2] test: Detect *lots* of bugs by adding non-alnum to trash dir names

2017-04-09 Thread Joachim Durchholz

On 09.04.2017 at 21:11, Ævar Arnfjörð Bjarmason wrote:

This series changes the test library so that we interpolate the result
of:

perl -e 'print join q[], grep { /[^[:alnum:]]/ and !m<[./]> } map chr, 0x01..0x7f'

Into the trash directory name we generate. Doing this makes 30% of the
test suite fail.


Wow.
I know that I tend to bring out weird errors in systems, but this is 
much deeper than I could have ever expected.


You might want to additionally test directory names like "asdf" and 
'asdf' - some shell might try to be smart about such names and 
automatically unquote them. This would probably mean running the test 
suite multiple times, which some people find offensive (myself included).


There are also high-bit characters that might cause oddness; 0x80 and
0xff look like interesting edge cases to me.
Note that just appending these bytes is going to cause failures with
filesystems that expect UTF-8 file names; you'd want to use a Unicode
code point whose encoding contains a 0x80 or 0xff byte, respectively
(I suspect 0xff is actually not a valid byte in UTF-8 at all, but I'd
have to double-check that).


Is git supposed to run fine under Windows, or under Linux with an
ntfs filesystem driver?
If yes, then I'd expect this patch to cause the other 70% of tests to
crash, because NTFS forbids several non-alnum printable characters,
among them < > : " \ | ? *


[PATCH] grep: plug a trivial memory leak

2017-04-09 Thread Ævar Arnfjörð Bjarmason
Change the cleanup phase for the grep command to free the pathspec
struct that's allocated earlier in the same block, and used just a few
lines earlier.

With "grep hi README.md" valgrind reports a loss of 239 bytes now,
down from 351.

The relevant --num-callers=40 --leak-check=full --show-leak-kinds=all
backtrace is:

[...] 187 (112 direct, 75 indirect) bytes in 1 blocks are definitely lost in loss record 70 of 110
[...]at 0x4C2BBAF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
[...]by 0x60B339: do_xmalloc (wrapper.c:59)
[...]by 0x60B2F6: xmalloc (wrapper.c:86)
[...]by 0x576B37: parse_pathspec (pathspec.c:652)
[...]by 0x4519F0: cmd_grep (grep.c:1215)
[...]by 0x4062EF: run_builtin (git.c:371)
[...]by 0x40544D: handle_builtin (git.c:572)
[...]by 0x4060A2: run_argv (git.c:624)
[...]by 0x4051C6: cmd_main (git.c:701)
[...]by 0x4C5901: main (common-main.c:43)

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

Since pretty much any non-trivial git command leaks because it skips
cleanup, are patches in this category even accepted? Worth a try...

 builtin/grep.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/builtin/grep.c b/builtin/grep.c
index 65070c52fc..3ffb5b4e81 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1299,6 +1299,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	hit |= wait_all();
 	if (hit && show_in_pager)
 		run_pager(&opt, prefix);
+	clear_pathspec(&pathspec);
 	free_grep_patterns(&opt);
return !hit;
 }
-- 
2.11.0



Re: [PATCH v5 3/8] convert: Split start_multi_file_filter into two separate functions

2017-04-09 Thread Lars Schneider

> On 07 Apr 2017, at 14:03, Ben Peart  wrote:
> 
> To enable future reuse of the filter..process infrastructure,
> split start_multi_file_filter into two separate parts.
> 
> start_multi_file_filter will now only contain the generic logic to
> manage the creation and tracking of the child process in a hashmap.
> 
> start_multi_file_filter_fn is a protocol specific initialization
> function that will negotiate the multi-file-filter interface version
> and capabilities.
> 
> Signed-off-by: Ben Peart 
> ---
> convert.c | 63 ++-
> 1 file changed, 38 insertions(+), 25 deletions(-)
> 
> diff --git a/convert.c b/convert.c
> index 793c29ebfd..404757eac9 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -565,35 +565,14 @@ static void stop_multi_file_filter(struct child_process *process)
>   finish_command(process);
> }
> 
> -static struct cmd2process *start_multi_file_filter(struct hashmap *hashmap, const char *cmd)
> +static int start_multi_file_filter_fn(struct cmd2process *entry)
> {
>   int err;
> - struct cmd2process *entry;
> - struct child_process *process;
> - const char *argv[] = { cmd, NULL };
>   struct string_list cap_list = STRING_LIST_INIT_NODUP;
>   char *cap_buf;
>   const char *cap_name;
> -
> - entry = xmalloc(sizeof(*entry));
> - entry->cmd = cmd;
> - entry->supported_capabilities = 0;
> - process = &entry->process;
> -
> - child_process_init(process);
> - process->argv = argv;
> - process->use_shell = 1;
> - process->in = -1;
> - process->out = -1;
> - process->clean_on_exit = 1;
> - process->clean_on_exit_handler = stop_multi_file_filter;
> -
> - if (start_command(process)) {
> - error("cannot fork to run external filter '%s'", cmd);
> - return NULL;
> - }
> -
> - hashmap_entry_init(entry, strhash(cmd));
> + struct child_process *process = &entry->process;
> + const char *cmd = entry->cmd;
> 
>   sigchain_push(SIGPIPE, SIG_IGN);
> 
> @@ -642,7 +621,41 @@ static struct cmd2process *start_multi_file_filter(struct hashmap *hashmap, cons
> done:
>   sigchain_pop(SIGPIPE);
> 
> - if (err || errno == EPIPE) {
> + if (err || errno == EPIPE)
> + err = err ? err : errno;

Nice! I should have done this here, too:
https://github.com/git/git/blob/b14f27f91770e0f99f64135348977a0ce1c7993a/convert.c#L755

This is clearly a bug in my code. I'll send a patch shortly.


> +
> + return err;
> +}
> +
> +static struct cmd2process *start_multi_file_filter(struct hashmap *hashmap, const char *cmd)
> +{
> + int err;
> + struct cmd2process *entry;
> + struct child_process *process;
> + const char *argv[] = { cmd, NULL };
> +
> + entry = xmalloc(sizeof(*entry));
> + entry->cmd = cmd;
> + entry->supported_capabilities = 0;
> + process = &entry->process;
> +
> + child_process_init(process);
> + process->argv = argv;
> + process->use_shell = 1;
> + process->in = -1;
> + process->out = -1;
> + process->clean_on_exit = 1;
> + process->clean_on_exit_handler = stop_multi_file_filter;
> +
> + if (start_command(process)) {
> + error("cannot fork to run external filter '%s'", cmd);
> + return NULL;
> + }
> +
> + hashmap_entry_init(entry, strhash(cmd));
> +
> + err = start_multi_file_filter_fn(entry);
> + if (err) {
>   error("initialization for external filter '%s' failed", cmd);
>   kill_multi_file_filter(hashmap, entry);
>   return NULL;
> -- 
> 2.12.0.windows.1.31.g1548525701.dirty

Looks good to me.

Thanks,
Lars



Re: [PATCH v5 2/8] convert: move packet_write_list() into pkt-line as packet_writel()

2017-04-09 Thread Lars Schneider

> On 07 Apr 2017, at 14:03, Ben Peart  wrote:
> 
> Add packet_writel() which writes multiple lines in a single call and
> then calls packet_flush_gently(). Update convert.c to use the new
> packet_writel() function from pkt-line.
> 
> Signed-off-by: Ben Peart 
> ---
> convert.c  | 23 ++-
> pkt-line.c | 19 +++
> pkt-line.h |  1 +
> 3 files changed, 22 insertions(+), 21 deletions(-)
> 
> diff --git a/convert.c b/convert.c
> index 8d652bf27c..793c29ebfd 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -521,25 +521,6 @@ static struct cmd2process *find_multi_file_filter_entry(struct hashmap *hashmap,
>   return hashmap_get(hashmap, &key, NULL);
> }
> 
> -static int packet_write_list(int fd, const char *line, ...)
> -{
> - va_list args;
> - int err;
> - va_start(args, line);
> - for (;;) {
> - if (!line)
> - break;
> - if (strlen(line) > LARGE_PACKET_DATA_MAX)
> - return -1;
> - err = packet_write_fmt_gently(fd, "%s\n", line);
> - if (err)
> - return err;
> - line = va_arg(args, const char*);
> - }
> - va_end(args);
> - return packet_flush_gently(fd);
> -}
> -
> static void read_multi_file_filter_status(int fd, struct strbuf *status)
> {
>   struct strbuf **pair;
> @@ -616,7 +597,7 @@ static struct cmd2process *start_multi_file_filter(struct hashmap *hashmap, cons
> 
>   sigchain_push(SIGPIPE, SIG_IGN);
> 
> - err = packet_write_list(process->in, "git-filter-client", "version=2", NULL);
> + err = packet_writel(process->in, "git-filter-client", "version=2", NULL);
>   if (err)
>   goto done;
> 
> @@ -632,7 +613,7 @@ static struct cmd2process *start_multi_file_filter(struct hashmap *hashmap, cons
>   if (err)
>   goto done;
> 
> - err = packet_write_list(process->in, "capability=clean", "capability=smudge", NULL);
> + err = packet_writel(process->in, "capability=clean", "capability=smudge", NULL);
> 
>   for (;;) {
>   cap_buf = packet_read_line(process->out, NULL);
> diff --git a/pkt-line.c b/pkt-line.c
> index 58842544b4..2788aa1af6 100644
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -171,6 +171,25 @@ int packet_write_fmt_gently(int fd, const char *fmt, ...)
>   return status;
> }
> 
> +int packet_writel(int fd, const char *line, ...)
> +{
> + va_list args;
> + int err;
> + va_start(args, line);
> + for (;;) {
> + if (!line)
> + break;
> + if (strlen(line) > LARGE_PACKET_DATA_MAX)
> + return -1;
> + err = packet_write_fmt_gently(fd, "%s\n", line);
> + if (err)
> + return err;
> + line = va_arg(args, const char*);
> + }
> + va_end(args);
> + return packet_flush_gently(fd);
> +}
> +
> static int packet_write_gently(const int fd_out, const char *buf, size_t size)
> {
>   static char packet_write_buffer[LARGE_PACKET_MAX];
> diff --git a/pkt-line.h b/pkt-line.h
> index 12b18991f6..cb3eda9695 100644
> --- a/pkt-line.h
> +++ b/pkt-line.h
> @@ -25,6 +25,7 @@ void packet_buf_flush(struct strbuf *buf);
> void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
> int packet_flush_gently(int fd);
> int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
> +int packet_writel(int fd, const char *line, ...);
> int write_packetized_from_fd(int fd_in, int fd_out);
> int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
> 
> -- 
> 2.12.0.windows.1.31.g1548525701.dirty

Looks good to me.

Thanks,
Lars


Re: [PATCH v5 1/8] pkt-line: add packet_read_line_gently()

2017-04-09 Thread Lars Schneider

> On 07 Apr 2017, at 14:03, Ben Peart  wrote:
> 
> Add packet_read_line_gently() to enable reading a line without dying on
> EOF.
> 
> Signed-off-by: Ben Peart 
> ---
> pkt-line.c | 12 
> pkt-line.h | 10 ++
> 2 files changed, 22 insertions(+)
> 
> diff --git a/pkt-line.c b/pkt-line.c
> index d4b6bfe076..58842544b4 100644
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -323,6 +323,18 @@ char *packet_read_line(int fd, int *len_p)
>   return packet_read_line_generic(fd, NULL, NULL, len_p);
> }
> 
> +int packet_read_line_gently(int fd, int *dst_len, char** dst_line)
> +{
> + int len = packet_read(fd, NULL, NULL,
> + packet_buffer, sizeof(packet_buffer),
> + PACKET_READ_CHOMP_NEWLINE|PACKET_READ_GENTLE_ON_EOF);

Really minor nit: you could align the parameters according to fd
similar to the "packet_read_line_generic" code (tab width 8).

> + if (dst_len)
> + *dst_len = len;
> + if (dst_line)
> + *dst_line = (len > 0) ? packet_buffer : NULL;

Minor nit: The explicit check "len > 0" is necessary here as len can
be "-1". The original "packet_read_line_generic" just checks for 
"len". I think it would be nice if the code would be consistent and both
would check "len > 0".

> + return len;
> +}
> +
> char *packet_read_line_buf(char **src, size_t *src_len, int *dst_len)
> {
>   return packet_read_line_generic(-1, src, src_len, dst_len);
> diff --git a/pkt-line.h b/pkt-line.h
> index 18eac64830..12b18991f6 100644
> --- a/pkt-line.h
> +++ b/pkt-line.h
> @@ -74,6 +74,16 @@ int packet_read(int fd, char **src_buffer, size_t *src_len, char
> char *packet_read_line(int fd, int *size);
> 
> /*
> + * Convenience wrapper for packet_read that sets the PACKET_READ_GENTLE_ON_EOF
> + * and CHOMP_NEWLINE options. The return value specifies the number of bytes
> + * read into the buffer or -1 on truncated input. if the *dst_line parameter

s/if/If/


> + * is not NULL it will return NULL for a flush packet and otherwise points to

This sentences is a bit confusing to me. Maybe:

If the *dst_line parameter is not NULL, then it will point to a static buffer
(that may be overwritten by subsequent calls) or it will return NULL for a
flush packet.

... feel free to completely ignore this as I am no native speaker.

> + * a static buffer (that may be overwritten by subsequent calls). If the size
> + * parameter is not NULL, the length of the packet is written to it.
> + */
> +int packet_read_line_gently(int fd, int *size, char** dst_line);
> +
> +/*
>  * Same as packet_read_line, but read from a buf rather than a descriptor;
>  * see packet_read for details on how src_* is used.
>  */

If you send another round then you could address the minor nits.
If not, then this patch as it is looks good to me.

Thanks,
Lars

[PATCH 2/2] test-lib: exhaustively insert non-alnum ASCII into the TRASH_DIRECTORY name

2017-04-09 Thread Ævar Arnfjörð Bjarmason
Change the test library to insert non-alphanumeric ASCII characters
into the TRASH_DIRECTORY name, that's the directory the test library
creates, chdirs to and runs each individual test from.

Unless test_fails_on_unusual_directory_names=1 is declared before
importing test-lib.sh (and if perl isn't available on the system), the
trash directory will contain every non-alphanumeric character in
ASCII, in order.

This includes all the control characters, !, [], {} etc. the "."
character isn't included because it's already in the directory name,
and nor is "/" for obvious reasons, although that would actually work,
we'd just create a subdirectory, which would make the tests harder to
inspect when they fail.i

This change is inspired by the "submodule: prevent backslash expantion
in submodule names" patch[1]. If we'd had backslashes in the
TRASH_DIRECTORY all along that bug would have been fixed a long time
ago. This will flag such issues by marking tests that currently fail
with "test_fails_on_unusual_directory_names=1", ensure that new tests
aren't added unless a discussion is had about why the code can't
handle unusual pathnames, and prevent future regressions.

1. <20170407172306.172673-1-bmw...@google.com>
---
 t/README  | 12 
 t/test-lib.sh |  4 
 2 files changed, 16 insertions(+)

diff --git a/t/README b/t/README
index ab386c3681..314dd40221 100644
--- a/t/README
+++ b/t/README
@@ -345,6 +345,18 @@ assignment to variable 'test_description', like this:
This test registers the following structure in the cache
and tries to run git-ls-files with option --frotz.'
 
+By default the tests will be run from a directory with a highly
+unusual filename that includes control characters, a newline, various
+punctuation etc., this is done to smoke out any bugs related to path
+handling. If for whatever reason the tests can't deal with such
+unusual path names, set:
+
+test_fails_on_unusual_directory_names=1
+
+Before sourcing 'test-lib.sh' as described below. This option is
+mainly intended to grandfather in existing broken tests & code, and
+should usually not be used in new code, instead your tests or code
+probably need fixing.
 
 Source 'test-lib.sh'
 
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 13b5696822..089ff5ac7d 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -914,6 +914,10 @@ fi
 
 # Test repository
 TRASH_DIRECTORY="trash directory.$(basename "$0" .sh)"
+if test -z "$test_fails_on_unusual_directory_names" -a "$(perl -e 'print 1+1' 2>/dev/null)" = "2"
+then
+   TRASH_DIRECTORY="$TRASH_DIRECTORY.$(perl -e 'print join q[], grep { /[^[:alnum:]]/ and !m<[./]> } map chr, 0x01..0x7f')"
+fi
 test -n "$root" && TRASH_DIRECTORY="$root/$TRASH_DIRECTORY"
 case "$TRASH_DIRECTORY" in
 /*) ;; # absolute path is good
-- 
2.11.0



[PATCH v3 2/4] t0021: make debug log file name configurable

2017-04-09 Thread Lars Schneider
The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log".
Make this configurable by defining the log file as first parameter of
"rot13-filter.pl".

This is useful if "rot13-filter.pl" is configured multiple times similar to the
subsequent patch 'convert: add "status=delayed" to filter process protocol'.
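The resulting configuration looks like this (a sketch; in the real
test setup "rot13-filter.pl" has to be reachable on PATH):

```shell
git init -q demo && cd demo
git config filter.protocol.process "rot13-filter.pl debug.log clean smudge"
git config filter.protocol.required true
git config filter.protocol.process   # echoes the value, log file first
```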

Signed-off-by: Lars Schneider 
---
 t/t0021-conversion.sh   | 44 ++--
 t/t0021/rot13-filter.pl |  8 +---
 2 files changed, 27 insertions(+), 25 deletions(-)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index ff2424225b..0139b460e7 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -28,7 +28,7 @@ file_size () {
 }
 
 filter_git () {
-   rm -f rot13-filter.log &&
+   rm -f *.log &&
git "$@"
 }
 
@@ -342,7 +342,7 @@ test_expect_success 'diff does not reuse worktree files that need cleaning' '
 '
 
 test_expect_success PERL 'required process filter should filter data' '
-   test_config_global filter.protocol.process "rot13-filter.pl clean smudge" &&
+   test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
test_config_global filter.protocol.required true &&
rm -rf repo &&
mkdir repo &&
@@ -375,7 +375,7 @@ test_expect_success PERL 'required process filter should filter data' '
IN: clean testsubdir/test3 '\''sq'\'',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
STOP
EOF
-   test_cmp_count expected.log rot13-filter.log &&
+   test_cmp_count expected.log debug.log &&
 
git commit -m "test commit 2" &&
rm -f test2.r "testsubdir/test3 '\''sq'\'',\$x=.r" &&
@@ -388,7 +388,7 @@ test_expect_success PERL 'required process filter should filter data' '
IN: smudge testsubdir/test3 '\''sq'\'',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
STOP
EOF
-   test_cmp_exclude_clean expected.log rot13-filter.log &&
+   test_cmp_exclude_clean expected.log debug.log &&
 
filter_git checkout --quiet --no-progress empty-branch &&
cat >expected.log <<-EOF &&
@@ -397,7 +397,7 @@ test_expect_success PERL 'required process filter should filter data' '
IN: clean test.r $S [OK] -- OUT: $S . [OK]
STOP
EOF
-   test_cmp_exclude_clean expected.log rot13-filter.log &&
+   test_cmp_exclude_clean expected.log debug.log &&
 
filter_git checkout --quiet --no-progress master &&
cat >expected.log <<-EOF &&
@@ -409,7 +409,7 @@ test_expect_success PERL 'required process filter should filter data' '
IN: smudge testsubdir/test3 '\''sq'\'',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
STOP
EOF
-   test_cmp_exclude_clean expected.log rot13-filter.log &&
+   test_cmp_exclude_clean expected.log debug.log &&
 
test_cmp_committed_rot13 "$TEST_ROOT/test.o" test.r &&
test_cmp_committed_rot13 "$TEST_ROOT/test2.o" test2.r &&
@@ -419,7 +419,7 @@ test_expect_success PERL 'required process filter should filter data' '
 
 test_expect_success PERL 'required process filter takes precedence' '
test_config_global filter.protocol.clean false &&
-   test_config_global filter.protocol.process "rot13-filter.pl clean" &&
+   test_config_global filter.protocol.process "rot13-filter.pl debug.log clean" &&
test_config_global filter.protocol.required true &&
rm -rf repo &&
mkdir repo &&
@@ -439,12 +439,12 @@ test_expect_success PERL 'required process filter takes precedence' '
IN: clean test.r $S [OK] -- OUT: $S . [OK]
STOP
EOF
-   test_cmp_count expected.log rot13-filter.log
+   test_cmp_count expected.log debug.log
)
 '
 
test_expect_success PERL 'required process filter should be used only for "clean" operation only' '
-   test_config_global filter.protocol.process "rot13-filter.pl clean" &&
+   test_config_global filter.protocol.process "rot13-filter.pl debug.log clean" &&
rm -rf repo &&
mkdir repo &&
(
@@ -462,7 +462,7 @@ test_expect_success PERL 'required process filter should be used only for "clean
IN: clean test.r $S [OK] -- OUT: $S . [OK]
STOP
EOF
-   test_cmp_count expected.log rot13-filter.log &&
+   test_cmp_count expected.log debug.log &&
 
rm test.r &&
 
@@ -474,12 +474,12 @@ test_expect_success PERL 'required process filter should be used only for "clean
init handshake complete
STOP

[PATCH v3 1/4] t0021: keep filter log files on comparison

2017-04-09 Thread Lars Schneider
The filter log files are modified on comparison. Write the modified log files
to temp files for comparison to fix this.

This is useful for the subsequent patch 'convert: add "status=delayed" to
filter process protocol'.
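The idea in isolation (a sketch, with plain diff standing in for
test_cmp):

```shell
cd "$(mktemp -d)"
printf 'IN: clean a\nIN: smudge b\n' >expect.log
printf 'IN: clean x\nIN: smudge b\n' >actual.log
for f in expect.log actual.log
do
	grep -v "IN: clean" "$f" >"$f.tmp"   # filter into temp copies...
done
diff expect.log.tmp actual.log.tmp       # ...and compare those
test -s expect.log && echo "original logs untouched"
```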

Signed-off-by: Lars Schneider 
---
 t/t0021-conversion.sh | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 161f560446..ff2424225b 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -42,10 +42,10 @@ test_cmp_count () {
for FILE in "$expect" "$actual"
do
sort "$FILE" | uniq -c |
-   sed -e "s/^ *[0-9][0-9]*[   ]*IN: /x IN: /" >"$FILE.tmp" &&
-   mv "$FILE.tmp" "$FILE" || return
+   sed -e "s/^ *[0-9][0-9]*[   ]*IN: /x IN: /" >"$FILE.tmp"
done &&
-   test_cmp "$expect" "$actual"
+   test_cmp "$expect.tmp" "$actual.tmp" &&
+   rm "$expect.tmp" "$actual.tmp"
 }
 
 # Compare two files but exclude all `clean` invocations because Git can
@@ -56,10 +56,10 @@ test_cmp_exclude_clean () {
actual=$2
for FILE in "$expect" "$actual"
do
-   grep -v "IN: clean" "$FILE" >"$FILE.tmp" &&
-   mv "$FILE.tmp" "$FILE"
+   grep -v "IN: clean" "$FILE" >"$FILE.tmp"
done &&
-   test_cmp "$expect" "$actual"
+   test_cmp "$expect.tmp" "$actual.tmp" &&
+   rm "$expect.tmp" "$actual.tmp"
 }
 
 # Check that the contents of two files are equal and that their rot13 version
-- 
2.12.2



[PATCH v3 4/4] convert: add "status=delayed" to filter process protocol

2017-04-09 Thread Lars Schneider
Some `clean` / `smudge` filters might require a significant amount of
time to process a single blob. During this process the Git checkout
operation is blocked and Git needs to wait until the filter is done to
continue with the checkout.

Teach the filter process protocol (introduced in edcc858) to accept the
status "delayed" as a response to a filter request. Upon this response Git
continues with the checkout operation. After the checkout operation Git
calls "finish_delayed_checkout" which queries the filter for remaining
blobs. If the filter is still working on the completion, then the filter
is expected to block. If the filter has completed all remaining blobs
then an empty response is expected.

Git has multiple code paths that check out a blob. Support delayed
checkouts only in `clone` (in unpack-trees.c) and `checkout` operations.

Signed-off-by: Lars Schneider 
---
 Documentation/gitattributes.txt |  73 +-
 builtin/checkout.c  |   3 +
 cache.h |  37 ++-
 convert.c   | 150 
 convert.h   |   5 +
 entry.c | 124 +++-
 t/t0021-conversion.sh   |  73 ++
 t/t0021/rot13-filter.pl | 210 
 unpack-trees.c  |   2 +
 9 files changed, 587 insertions(+), 90 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index e0b66c1220..329baa945f 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -425,8 +425,8 @@ packet:  git< capability=clean
 packet:  git< capability=smudge
 packet:  git< 
 
-Supported filter capabilities in version 2 are "clean" and
-"smudge".
+Supported filter capabilities in version 2 are "clean", "smudge",
+and "delay".
 
 Afterwards Git sends a list of "key=value" pairs terminated with
 a flush packet. The list will contain at least the filter command
@@ -512,12 +512,77 @@ the protocol then Git will stop the filter process and restart it
 with the next file that needs to be processed. Depending on the
 `filter..required` flag Git will interpret that as error.
 
-After the filter has processed a blob it is expected to wait for
-the next "key=value" list containing a command. Git will close
+After the filter has processed a command it is expected to wait for
+a "key=value" list containing the next command. Git will close
 the command pipe on exit. The filter is expected to detect EOF
 and exit gracefully on its own. Git will wait until the filter
 process has stopped.
 
+Delay
+^
+
+If the filter supports the "delay" capability, then Git can send the
+flag "delay-able" after the filter command and pathname. This flag
+denotes that the filter can delay filtering the current blob (e.g. to
+compensate network latencies) by responding with no content but with
+the status "delayed" and a flush packet. Git will answer with a
+"delay-id", a number that identifies the blob, and a flush packet. The
+filter acknowledges this number with a "success" status and a flush
+packet.
+
+packet:  git> command=smudge
+packet:  git> pathname=path/testfile.dat
+packet:  git> delay-able=1
+packet:  git> 
+packet:  git> CONTENT
+packet:  git> 
+packet:  git< status=delayed
+packet:  git< 
+packet:  git> delay-id=1
+packet:  git> 
+packet:  git< status=success
+packet:  git< 
+
+
+If the filter supports the "delay" capability then it must support the
+"list_available_blobs" command. If Git sends this command, then the
+filter is expected to return a list of "delay_ids" of blobs that are
+available. The list must be terminated with a flush packet followed
+by a "success" status that is also terminated with a flush packet. If
+no blobs for the delayed paths are available, yet, then the filter is
+expected to block the response until at least one blob becomes
+available. The filter can tell Git that it has no more delayed blobs
+by sending an empty list.
+
+packet:  git> command=list_available_blobs
+packet:  git> 
+packet:  git< 7
+packet:  git< 13
+packet:  git< 
+packet:  git< status=success
+packet:  git< 
+
+
+After Git received the "delay_ids", it will request the corresponding
+blobs again. These requests contain a "delay-id" and an empty content
+section. The filter is expected to respond with the smudged content
+in the usual way as explained above.
+
+packet:  git> command=smudge
+packet:  git> pathname=test-delay10.a
+packet:  git> delay-id=0
+packet:  git> 
+packet:  git>   # 
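For readers unfamiliar with the `packet:` notation in the exchanges above, the pkt-line framing can be sketched in plain shell. `pkt_line` and `pkt_flush` below are illustrative helpers, not part of Git:

```shell
# Each pkt-line is a 4-digit hex length followed by the payload; the length
# covers the 4 header bytes themselves plus the payload and, for textual
# packets, a trailing newline.  A flush packet is the special value "0000".
pkt_line () {
	printf '%04x%s\n' $((4 + ${#1} + 1)) "$1"
}
pkt_flush () {
	printf '0000'
}

# The filter's "delayed" answer from the exchange above would be framed as:
pkt_line 'status=delayed'
pkt_flush
```

So `status=delayed` goes over the wire as `0013status=delayed` (0x13 = 19 bytes: 4 header + 14 payload + 1 newline) followed by the `0000` flush.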

[PATCH 1/2] tests: mark tests that fail when the TEST_DIRECTORY is unusual

2017-04-09 Thread Ævar Arnfjörð Bjarmason
Mark tests that fail when the TEST_DIRECTORY has a name that's
unusual, i.e. contains control characters, various punctuation etc.

This change is followed by a change that adds the code that picks up
test_fails_on_unusual_directory_names to test-lib.sh, but comes before
that change to not break most of the test suite during bisecting.

This change is the result of:

for f in $(grep -B3 last_fail_time .prove | grep sh | perl -pe 's/^  (.*?):/$1/');
do perl -0666 -pi -e 's/^(test_description)/test_fails_on_unusual_directory_names=1\n$1/m' $f
done

run after the test suite had been run with the aforementioned change to
test-lib.sh. The only exception is t-basic.sh: since it tests the
test suite itself, its failing 'test --verbose' test needs a change
to the test suite wrapper code to be excluded.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t-basic.sh | 2 ++
 t/t0003-attributes.sh| 1 +
 t/t0021-conversion.sh| 1 +
 t/t0060-path-utils.sh| 1 +
 t/t0300-credentials.sh   | 1 +
 t/t0302-credential-store.sh  | 1 +
 t/t1006-cat-file.sh  | 1 +
 t/t1013-read-tree-submodule.sh   | 1 +
 t/t1020-subdirectory.sh  | 1 +
 t/t1050-large.sh | 1 +
 t/t1060-object-corruption.sh | 1 +
 t/t1300-repo-config.sh   | 1 +
 t/t1305-config-include.sh| 1 +
 t/t1308-config-set.sh| 1 +
 t/t1309-early-config.sh  | 1 +
 t/t1500-rev-parse.sh | 1 +
 t/t1504-ceiling-dirs.sh  | 1 +
 t/t1507-rev-parse-upstream.sh| 1 +
 t/t1510-repo-setup.sh| 1 +
 t/t1515-rev-parse-outside-repo.sh| 1 +
 t/t2013-checkout-submodule.sh| 1 +
 t/t2025-worktree-add.sh  | 1 +
 t/t2027-worktree-list.sh | 1 +
 t/t2028-worktree-move.sh | 1 +
 t/t3040-subprojects-basic.sh | 1 +
 t/t3050-subprojects-fetch.sh | 1 +
 t/t3409-rebase-preserve-merges.sh| 1 +
 t/t3426-rebase-submodule.sh  | 1 +
 t/t3501-revert-cherry-pick.sh| 1 +
 t/t3512-cherry-pick-submodule.sh | 1 +
 t/t3513-revert-submodule.sh  | 1 +
 t/t3600-rm.sh| 1 +
 t/t3900-i18n-commit.sh   | 1 +
 t/t3906-stash-submodule.sh   | 1 +
 t/t4027-diff-submodule.sh| 1 +
 t/t4030-diff-textconv.sh | 1 +
 t/t4031-diff-rewrite-binary.sh   | 1 +
 t/t4035-diff-quiet.sh| 1 +
 t/t4041-diff-submodule-option.sh | 1 +
 t/t4059-diff-submodule-not-initialized.sh| 1 +
 t/t4060-diff-submodule-option-diff-format.sh | 1 +
 t/t4137-apply-submodule.sh   | 1 +
 t/t4203-mailmap.sh   | 1 +
 t/t4207-log-decoration-colors.sh | 1 +
 t/t4255-am-submodule.sh  | 1 +
 t/t5000-tar-tree.sh  | 1 +
 t/t5001-archive-attr.sh  | 1 +
 t/t5002-archive-attr-pattern.sh  | 1 +
 t/t5003-archive-zip.sh   | 1 +
 t/t5150-request-pull.sh  | 1 +
 t/t5300-pack-object.sh   | 1 +
 t/t5304-prune.sh | 1 +
 t/t5305-include-tag.sh   | 1 +
 t/t5306-pack-nobase.sh   | 1 +
 t/t5310-pack-bitmaps.sh  | 1 +
 t/t5311-pack-bitmaps-shallow.sh  | 1 +
 t/t5400-send-pack.sh | 1 +
 t/t5401-update-hooks.sh  | 1 +
 t/t5402-post-merge-hook.sh   | 1 +
 t/t5403-post-checkout-hook.sh| 1 +
 t/t5404-tracking-branches.sh | 1 +
 t/t5406-remote-rejects.sh| 1 +
 t/t5407-post-rewrite-hook.sh | 1 +
 t/t5500-fetch-pack.sh| 1 +
 t/t5501-fetch-push-alternates.sh | 1 +
 t/t5502-quickfetch.sh| 1 +
 t/t5505-remote.sh| 1 +
 t/t5509-fetch-push-namespaces.sh | 1 +
 t/t5510-fetch.sh | 1 +
 t/t5512-ls-remote.sh | 1 +
 t/t5514-fetch-multiple.sh| 1 +
 t/t5515-fetch-merge-logic.sh | 1 +
 t/t5516-fetch-push.sh| 1 +
 

[PATCH v3 3/4] t0021: write "OUT" only on success

2017-04-09 Thread Lars Schneider
"rot13-filter.pl" used to write "OUT " to the debug log even in case of
an abort or error. Fix this by writing "OUT " to the debug log only in
the successful case, when output is actually written.

This is useful for the subsequent patch 'convert: add "status=delayed" to
filter process protocol'.

Signed-off-by: Lars Schneider 
---
 t/t0021-conversion.sh   | 6 +++---
 t/t0021/rot13-filter.pl | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 0139b460e7..0c04d346a1 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -588,7 +588,7 @@ test_expect_success PERL 'process filter should restart after unexpected write f
cat >expected.log <<-EOF &&
START
init handshake complete
-   IN: smudge smudge-write-fail.r $SF [OK] -- OUT: $SF [WRITE FAIL]
+   IN: smudge smudge-write-fail.r $SF [OK] -- [WRITE FAIL]
START
init handshake complete
IN: smudge test.r $S [OK] -- OUT: $S . [OK]
@@ -634,7 +634,7 @@ test_expect_success PERL 'process filter should not be restarted if it signals a
cat >expected.log <<-EOF &&
START
init handshake complete
-   IN: smudge error.r $SE [OK] -- OUT: 0 [ERROR]
+   IN: smudge error.r $SE [OK] -- [ERROR]
IN: smudge test.r $S [OK] -- OUT: $S . [OK]
IN: smudge test2.r $S2 [OK] -- OUT: $S2 . [OK]
STOP
@@ -673,7 +673,7 @@ test_expect_success PERL 'process filter abort stops processing of all further f
cat >expected.log <<-EOF &&
START
init handshake complete
-   IN: smudge abort.r $SA [OK] -- OUT: 0 [ABORT]
+   IN: smudge abort.r $SA [OK] -- [ABORT]
STOP
EOF
test_cmp_exclude_clean expected.log debug.log &&
diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 0b943bb377..5e43faeec1 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -153,9 +153,6 @@ while (1) {
die "bad command '$command'";
}
 
-   print $debug "OUT: " . length($output) . " ";
-   $debug->flush();
-
if ( $pathname eq "error.r" ) {
print $debug "[ERROR]\n";
$debug->flush();
@@ -178,6 +175,9 @@ while (1) {
die "${command} write error";
}
 
+   print $debug "OUT: " . length($output) . " ";
+   $debug->flush();
+
while ( length($output) > 0 ) {
		my $packet = substr( $output, 0, $MAX_PACKET_CONTENT_SIZE );
packet_bin_write($packet);
-- 
2.12.2



[PATCH v3 0/4] convert: add "status=delayed" to filter process protocol

2017-04-09 Thread Lars Schneider
Hi,

in v3 the "delay filter" change became a series. Patches 1 to 3 are minor t0021 test
adjustments and patch 4 is the actual change.

Most significant change since v2:
If the filter delays a blob, then Git sends the filter a "delay-id". Git uses
this "delay-id" as an index into an array of delayed "cached entries". When Git
requests a previously delayed blob it will only send the "delay-id"
to identify the blob. The actual blob content will not be sent to the filter
again.

If you review this series then please read the "Delay" section in
"Documentation/gitattributes.txt" first for an overview of the delay mechanism.
The changes in "t/t0021/rot13-filter.pl" are easier to review if you ignore
whitespace changes.

Thanks,
Lars

RFC: http://public-inbox.org/git/d10f7c47-14e8-465b-8b7a-a09a1b28a...@gmail.com/
v1: http://public-inbox.org/git/20170108191736.47359-1-larsxschnei...@gmail.com/
v2: http://public-inbox.org/git/20170226184816.30010-1-larsxschnei...@gmail.com/


Base Ref: v2.12.0
Web-Diff: https://github.com/larsxschneider/git/commit/08a461f103
Checkout: git fetch https://github.com/larsxschneider/git filter-process/delay-v3 && git checkout 08a461f103


Interdiff (v2..v3):

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index f6bad8db40..329baa945f 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -425,8 +425,8 @@ packet:  git< capability=clean
 packet:  git< capability=smudge
 packet:  git< 
 
-Supported filter capabilities in version 2 are "clean" and
-"smudge".
+Supported filter capabilities in version 2 are "clean", "smudge",
+and "delay".

 Afterwards Git sends a list of "key=value" pairs terminated with
 a flush packet. The list will contain at least the filter command
@@ -473,15 +473,6 @@ packet:  git<   # empty content!
 packet:  git<   # empty list, keep "status=success" unchanged!
 

-If the request cannot be fulfilled within a reasonable amount of time
-then the filter can respond with a "delayed" status and a flush packet.
-Git will perform the same request at a later point in time, again. The
-filter can delay a response multiple times for a single request.
-
-packet:  git< status=delayed
-packet:  git< 
-
-
 In case the filter cannot or does not want to process the content,
 it is expected to respond with an "error" status.
 
@@ -521,12 +512,77 @@ the protocol then Git will stop the filter process and restart it
 with the next file that needs to be processed. Depending on the
 `filter..required` flag Git will interpret that as error.

-After the filter has processed a blob it is expected to wait for
-the next "key=value" list containing a command. Git will close
+After the filter has processed a command it is expected to wait for
+a "key=value" list containing the next command. Git will close
 the command pipe on exit. The filter is expected to detect EOF
 and exit gracefully on its own. Git will wait until the filter
 process has stopped.

+Delay
+^
+
+If the filter supports the "delay" capability, then Git can send the
+flag "delay-able" after the filter command and pathname. This flag
+denotes that the filter can delay filtering the current blob (e.g. to
+compensate network latencies) by responding with no content but with
+the status "delayed" and a flush packet. Git will answer with a
+"delay-id", a number that identifies the blob, and a flush packet. The
+filter acknowledges this number with a "success" status and a flush
+packet.
+
+packet:  git> command=smudge
+packet:  git> pathname=path/testfile.dat
+packet:  git> delay-able=1
+packet:  git> 
+packet:  git> CONTENT
+packet:  git> 
+packet:  git< status=delayed
+packet:  git< 
+packet:  git> delay-id=1
+packet:  git> 
+packet:  git< status=success
+packet:  git< 
+
+
+If the filter supports the "delay" capability then it must support the
+"list_available_blobs" command. If Git sends this command, then the
+filter is expected to return a list of "delay_ids" of blobs that are
+available. The list must be terminated with a flush packet followed
+by a "success" status that is also terminated with a flush packet. If
+no blobs for the delayed paths are available, yet, then the filter is
+expected to block the response until at least one blob becomes
+available. The filter can tell Git that it has no more delayed blobs
+by sending an empty list.
+
+packet:  git> command=list_available_blobs
+packet:  git> 
+packet:  git< 7
+packet:  git< 13
+packet:  git< 
+packet:  git< status=success
+packet:  git< 
+
+

[PATCH 0/2] test: Detect *lots* of bugs by adding non-alnum to trash dir names

2017-04-09 Thread Ævar Arnfjörð Bjarmason
There's a patch now on the ML for an issue with `read` vs. `read -r` in
the submodule code, revealed when the submodule name contains a backslash.

That fix is fine, but we should do better and structurally arrange our
tests to detect these sorts of issues before they're reported.
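The `read` vs. `read -r` class of bug can be reproduced in isolation; a minimal sketch (the path name here is made up):

```shell
# Without -r, read treats a backslash as an escape character and drops it,
# so a name such as "sub\module" is silently mangled; with -r it survives.
printf 'sub\\module\n' | while read name; do printf 'without -r: %s\n' "$name"; done
printf 'sub\\module\n' | while read -r name; do printf 'with -r:    %s\n' "$name"; done
```

The first loop prints `submodule`, the second `sub\module`, which is exactly the kind of silent corruption an unusual trash directory name flushes out.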

This series changes the test library so that we interpolate the result
of:

perl -e 'print join q[], grep { /[^[:alnum:]]/ and !m<[./]> } map chr, 0x01..0x7f'
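To see why such names are effective, one can create a directory with a hand-picked sample of those characters; this is only an illustration of the idea, not the test-lib change itself:

```shell
# A subset of the punctuation the test library would interpolate into the
# trash directory name (control characters and the single quote are omitted
# here to keep the quoting readable).  Any helper that later expands such a
# path unquoted breaks immediately on a name like this.
dir='trash !"#$%&()*+,:;<=>?@[\]^`{|}~'
mkdir -- "$dir" &&
test -d "$dir" &&
echo created &&
rmdir -- "$dir"
```

Creating and removing the directory works fine when every expansion is quoted; it is the unquoted uses scattered through the tests that the real change smokes out.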

Into the trash directory name we generate. Doing this makes 30% of the
test suite fail. Most of the diffstat below is just adding:

test_fails_on_unusual_directory_names=1

To those tests that failed, which makes them not use these garbage
trash directory names.

Some of those failures are legitimate bugs, some are an artifact of us
using shell scripting as our test language (namely interpolating ' and
" somewhere).

It might make sense to mark these tests more granularly, e.g.:

test_fails_on_unusual_directory_name_types=quotes

etc., if the test fails just because of " or ', but I think as it is
this makes sense for inclusion, it makes sure we don't regress on the
remaining 70% of our test suite.

Ævar Arnfjörð Bjarmason (2):
  tests: mark tests that fail when the TEST_DIRECTORY is unusual
  test-lib: exhaustively insert non-alnum ASCII into the TRASH_DIRECTORY
name

 t/README | 12 
 t/t-basic.sh |  2 ++
 t/t0003-attributes.sh|  1 +
 t/t0021-conversion.sh|  1 +
 t/t0060-path-utils.sh|  1 +
 t/t0300-credentials.sh   |  1 +
 t/t0302-credential-store.sh  |  1 +
 t/t1006-cat-file.sh  |  1 +
 t/t1013-read-tree-submodule.sh   |  1 +
 t/t1020-subdirectory.sh  |  1 +
 t/t1050-large.sh |  1 +
 t/t1060-object-corruption.sh |  1 +
 t/t1300-repo-config.sh   |  1 +
 t/t1305-config-include.sh|  1 +
 t/t1308-config-set.sh|  1 +
 t/t1309-early-config.sh  |  1 +
 t/t1500-rev-parse.sh |  1 +
 t/t1504-ceiling-dirs.sh  |  1 +
 t/t1507-rev-parse-upstream.sh|  1 +
 t/t1510-repo-setup.sh|  1 +
 t/t1515-rev-parse-outside-repo.sh|  1 +
 t/t2013-checkout-submodule.sh|  1 +
 t/t2025-worktree-add.sh  |  1 +
 t/t2027-worktree-list.sh |  1 +
 t/t2028-worktree-move.sh |  1 +
 t/t3040-subprojects-basic.sh |  1 +
 t/t3050-subprojects-fetch.sh |  1 +
 t/t3409-rebase-preserve-merges.sh|  1 +
 t/t3426-rebase-submodule.sh  |  1 +
 t/t3501-revert-cherry-pick.sh|  1 +
 t/t3512-cherry-pick-submodule.sh |  1 +
 t/t3513-revert-submodule.sh  |  1 +
 t/t3600-rm.sh|  1 +
 t/t3900-i18n-commit.sh   |  1 +
 t/t3906-stash-submodule.sh   |  1 +
 t/t4027-diff-submodule.sh|  1 +
 t/t4030-diff-textconv.sh |  1 +
 t/t4031-diff-rewrite-binary.sh   |  1 +
 t/t4035-diff-quiet.sh|  1 +
 t/t4041-diff-submodule-option.sh |  1 +
 t/t4059-diff-submodule-not-initialized.sh|  1 +
 t/t4060-diff-submodule-option-diff-format.sh |  1 +
 t/t4137-apply-submodule.sh   |  1 +
 t/t4203-mailmap.sh   |  1 +
 t/t4207-log-decoration-colors.sh |  1 +
 t/t4255-am-submodule.sh  |  1 +
 t/t5000-tar-tree.sh  |  1 +
 t/t5001-archive-attr.sh  |  1 +
 t/t5002-archive-attr-pattern.sh  |  1 +
 t/t5003-archive-zip.sh   |  1 +
 t/t5150-request-pull.sh  |  1 +
 t/t5300-pack-object.sh   |  1 +
 t/t5304-prune.sh |  1 +
 t/t5305-include-tag.sh   |  1 +
 t/t5306-pack-nobase.sh   |  1 +
 t/t5310-pack-bitmaps.sh  |  1 +
 t/t5311-pack-bitmaps-shallow.sh  |  1 +
 t/t5400-send-pack.sh |  1 +
 t/t5401-update-hooks.sh  |  1 +
 t/t5402-post-merge-hook.sh   |  1 +
 t/t5403-post-checkout-hook.sh|  1 +
 t/t5404-tracking-branches.sh |  1 +
 t/t5406-remote-rejects.sh|  1 +
 t/t5407-post-rewrite-hook.sh |  1 +
 

Re: [PATCH v2] convert: add "status=delayed" to filter process protocol

2017-04-09 Thread Lars Schneider

> On 27 Feb 2017, at 23:11, Jakub Narębski  wrote:
> 
> W dniu 27.02.2017 o 11:32, Lars Schneider pisze:
>> 
>>> On 27 Feb 2017, at 10:58, Jeff King  wrote:
>>> 
>>> On Sun, Feb 26, 2017 at 07:48:16PM +0100, Lars Schneider wrote:
>>> 
 +If the request cannot be fulfilled within a reasonable amount of time
 +then the filter can respond with a "delayed" status and a flush packet.
 +Git will perform the same request at a later point in time, again. The
 +filter can delay a response multiple times for a single request.
 +
 +packet:  git< status=delayed
 +packet:  git< 
 +
> 
> Is it something that happens instead of filter process sending the contents

Correct! I'll clarify this in v3!


>> 
>> I completely agree - I need to change that. However, the goal of the v2
>> iteration was to get the "convert" interface in an acceptable state.
>> That's what I intended to say in the patch comment section:
>> 
>>"Please ignore all changes behind async_convert_to_working_tree() and 
>> async_filter_finish() for now as I plan to change the implementation 
>> as soon as the interface is in an acceptable state."
> 
> I think that it is more important to start with a good abstraction,
> and the proposal for protocol, rather than getting bogged down in
> implementation details that may change as the idea for protocol
> extension changes.

I'll send out v3 shortly as proposal for a complete solution.


>>> I think it would be much more efficient to do something like:
>>> 
>>> [Git issues a request and gives it an opaque index id]
>>> git> command=smudge
>>> git> pathname=foo
>>> git> index=0
>>> git> 
>>> git> CONTENT
>>> git> 
>>> 
>>> [The data isn't ready yet, so the filter tells us so...]
>>> git< status=delayed
>>> git< 
> 
> So is it only as replacement for "status=success" + contents or
> "status=abort", that is upfront before sending any part of the file?

Yes.


> Or, as one can assume from the point of the paragraph with the
> "status=delayed", it is about replacing null list for success or
> "status=error" after sending some part (maybe empty) of a file,
> that is:

No. As this would complicate things I don't want to support it. 
(and I clarified that in the docs in v3).


> If it would not be undue burden on the filter driver process, we might
> require for it to say where to continue at (in bytes), e.g.
> 
>git< from=16426
> 
> That should, of course, go below index/pathname line.

This would make the protocol even more complicated. That's why I don't
want to support splitting the response.


>>> git< index=0
> 
> Or a filter driver could have used pathname as an index, that is
> 
>git< pathname=path/testfile.dat

In v3 I've used an index to help Git finding the right cache entry
quickly.


> 
>>> git< 
>>> git< CONTENT
>>> git< 
>>> 
>>> From Git's side, the loop is something like:
>>> 
>>> while (delayed_items > 0) {
>>> /* issue a wait, and get back the status/index pair */
>>> status = send_wait();
>>> delayed_items--;
> 
> This looks like my 'event loop' proposal[1][2], see below.

I implemented something similar in v3.


>> That could work! I had something like that in mind:
>> 
>> I teach Git a new command "list_completed" or similar. The filter
>> blocks this call until at least one item is ready for Git. 
>> Then the filter responds with a list of paths that identify the
>> "ready items". Then Git asks for these ready items just with the
>> path and not with any content. Could that work? Wouldn't the path
>> be "unique" to identify a blob per filter run?
> 
> Why in the "drain" phase it is still Git that needs to ask filter for
> contents, one file after another?  Wouldn't it be easier and simpler
> for filter to finish sending contents, and send signal that it has
> finished continue'ing?
> 
> To summarize my earlier emails, current proposal looks for me as if
> it were a "busy loop" solution, that is[2]:

In v3 the implementation still uses a kind of busy loop (I expect the
filter to block if there is nothing ready yet). An event loop would
complicate the protocol, as the filter would need to initiate an action.
Right now only Git initiates actions.


> Footnotes:
> --
> a) We don't send the Git-side contents of blob again, isn't it?
>   So we need some protocol extension / new understanding anyway.
>   for example that we don't send contents if we request path again.
Correct - v3 doesn't send the content again.


> Also, one thing that we need to be solved, assuming that the proposed
> extension allows to send partial data from filter to be delayed and
> continued later, is that Git needs to keep this partial response in buf;
> this is because of precedence of gitattributes applying:

As mentioned above I don't want to support partial data as this
complicates things and is of no use for my Git LFS problem case.


Re: [PATCH v2] convert: add "status=delayed" to filter process protocol

2017-04-09 Thread Lars Schneider

> On 27 Feb 2017, at 11:53, Jeff King  wrote:
> 
> On Mon, Feb 27, 2017 at 11:32:47AM +0100, Lars Schneider wrote:
> 
>> ...
> 
>>> From Git's side, the loop is something like:
>>> 
>>> while (delayed_items > 0) {
>>> /* issue a wait, and get back the status/index pair */
>>> status = send_wait();
>>> delayed_items--;
>>> 
>>> /*
>>>  * use "index" to find the right item in our list of files;
>>>  * the format can be opaque to the filter, so we could index
>>>  * it however we like. But probably numeric indices in an array
>>>  * are the simplest.
>>>  */
>>> assert(index > 0 && index < nr_items);
>>> item[index].status = status;
>>> if (status == SUCCESS)
>>> read_content([index]);
>>> }
>>> 
>>> and the filter side just attaches the "index" string to whatever its
>>> internal queue structure is, and feeds it back verbatim when processing
>>> that item finishes.
>> 
>> That could work! I had something like that in mind:
>> 
>> I teach Git a new command "list_completed" or similar. The filter
>> blocks this call until at least one item is ready for Git. 
>> Then the filter responds with a list of paths that identify the
>> "ready items". Then Git asks for these ready items just with the
>> path and not with any content. Could that work? Wouldn't the path
>> be "unique" to identify a blob per filter run?
> 
> I think that could work, though I think there are few minor downsides
> compared to what I wrote above:
> 
>  - if you respond with "these items are ready", and then make Git ask
>for each again, it's an extra round-trip for each set of ready
>items. You could just say "an item is ready; here it is" in a single
>response. For a local pipe the latency is probably negligible,
>though.

It is true that the extra round-trip is not strictly necessary but I think
it simplifies the protocol/the code as I can reuse the convert machinery 
as is.


>  - using paths as the index would probably work, but it means Git has
>to use the path to find the "struct checkout_entry" again. Which
>might mean a hashmap (though if you have them all in a sorted list,
>I guess you could also do a binary search).

Agreed. I changed my implementation to use an index following your
suggestion.

>  - Using an explicit index communicates to the filter not only what the
>index is, but also that Git is prepared to accept a delayed response
>for the item. For backwards compatibility, the filter would probably
>advertise "I have the 'delayed' capability", and then Git could
>choose to use it or not on a per-item basis. Realistically it would
>not change from item to item, but rather operation to operation. So
>that means we can easily convert the call-sites in Git to the async
>approach incrementally. As each one is converted, it turns on the
>flag that causes the filter code to send the "index" tag.

Agreed. I changed the implementation accordingly and I will send out the
patches shortly.

Thanks,
Lars


Re: [PATCH] Make git log work for git CWD outside of work tree

2017-04-09 Thread Danny Sauer
On 4/9/2017 5:54 AM, Johannes Schindelin wrote:
> Hi Danny,
>
> On Sat, 8 Apr 2017, Danny Sauer wrote:
>> diff --git a/git.c b/git.c
>> index 8ff44f0..e147f01 100644
>> --- a/git.c
>> +++ b/git.c
>> @@ -440,7 +440,7 @@ static struct cmd_struct commands[] = {
>>  { "init", cmd_init_db },
>>  { "init-db", cmd_init_db },
>>  { "interpret-trailers", cmd_interpret_trailers, RUN_SETUP_GENTLY },
>> -{ "log", cmd_log, RUN_SETUP },
>> +{ "log", cmd_log, RUN_SETUP | NEED_WORK_TREE },
>>  { "ls-files", cmd_ls_files, RUN_SETUP | SUPPORT_SUPER_PREFIX },
>>  { "ls-remote", cmd_ls_remote, RUN_SETUP_GENTLY },
>>  { "ls-tree", cmd_ls_tree, RUN_SETUP },
> This may work for you, but it does not work for me, as I often call `git
> log` in a bare repository. And that call works, and it should keep
> working.

I was wondering about that situation, but honestly I wasn't positive how
to test it and sort of just went on the hope that the read_mailmap_blob
mechanism would continue handling it.  So, I guess there's another test
case which would be useful, as passing the test cases is what gave me
that undeserved confidence. :)  And I'm going to need to read a bit more
to figure out what a bare repository actually is...

> Instead, I think, you need to figure out why the .mailmap file is not read
> correctly when you use the GIT_DIR & GIT_WORK_TREE approach. My vague
> hunch is that you need to replace the ".mailmap" in read_mailmap()
> (defined in mailmap.c):
>
> err |= read_mailmap_file(map, ".mailmap", repo_abbrev);
>
> by something like mkpath("%s/%s", get_git_work_tree(), ".mailmap"), but
> probably only after testing that we're not in a bare repository (which
> would also fix a bug, I suspect, as `git log` in a bare repository
> probably heeds a .mailmap file in the current directory, which is
> incorrect). I.e. something like:
>
>   if (!is_bare_repository()) {
>   const char *path = mkpath("%s/%s",
> get_git_work_tree(), ".mailmap")
>   err |= read_mailmap_file(map, path, repo_abbrev);
>   }

That's feedback I was looking for - what's the "right" way to get the
path to the file.  The call as-is works in any subdirectory of a
repository, because there's some logic buried along the way which puts
you in the top level of the repo if you started underneath.  But if
you're outside, it does the same chdir to verify that the top level is a
real repo and then returns you to where you started (which is perfectly
reasonable).

In comparing how shortlog works v/s log, the main difference I could see
is the use of RUN_SETUP_GENTLY instead of RUN_SETUP as the flag, which
itself seems to only differ by passing in a true value for nongit_ok in
run_setup, ultimately leading to "some different directory checks" -
which I somewhat got lost in due partially to not fully knowing what a
bare repository is. :)  But changing that didn't make it work in my
(poorly-formalized) testing.

I'm on-board with a proper test case being the first situation which
needs rectifying.  It's off to learn stuff for me, then.  Thanks!

--Danny



Unexpected working directory in post-receive hook in non-bare repository

2017-04-09 Thread Simon Ruderich
Hello,

The following snippet reproduces the issue for me (note the
remote: line in its output):

git --version

rm -rf a b

git init a
cd a
echo first >data
git add data
git commit -m initial
cat >>.git/hooks/post-receive <<\EOF
#!/bin/sh
pwd
EOF
chmod +x .git/hooks/post-receive
cd ..
git clone a b
cd b
echo second >>data
git add data
git commit -m test
git push origin master:not-master

According to man githooks "Before Git invokes a hook, it changes
its working directory to either the root of the working tree in a
non-bare repository, [...]". In this case "a" is non-bare and I
expected the command to be run in the working tree; but instead
it's run inside .git. (This caused some confusion in my case
because I ran "git merge" in the hook which put files in the .git
directory and I didn't notice it at first. I know running merge
in receive-hooks is "bad practice" but it works fine in my
setup.)

The same happens for all hooks executed by git-receive-pack:
pre-receive, update, post-receive, post-update.

Is this a documentation issue or unexpected behavior?

Regards
Simon
-- 
+ privacy is necessary
+ using gnupg http://gnupg.org
+ public key id: 0x92FEFDB7E44C32F9




Re: [PATCH] push: document & test --force-with-lease with multiple remotes

2017-04-09 Thread Ævar Arnfjörð Bjarmason
On Sun, Apr 9, 2017 at 11:55 AM, Simon Ruderich  wrote:
> Hello,
>
> I like the documentation update and test.
>
> On Sat, Apr 08, 2017 at 11:41:00AM +, Ævar Arnfjörð Bjarmason wrote:
>> [snip]
>>
>> ++
>> +Now when the background process runs `git fetch origin` the references
>> +on `origin-push` won't be updated, and thus commands like:
>> ++
>> + git push --force-with-lease origin
>
> I think this should be origin-push.

Thanks! Well spotted. Will fix in v2 which I'll hold off on sending
for now pending more comments.

> Regards
> Simon
> --
> + privacy is necessary
> + using gnupg http://gnupg.org
> + public key id: 0x92FEFDB7E44C32F9


Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-09 Thread Stefan Haller
Jacob Keller  wrote:

> Agreed. You "take" a lease whenever you push to the remote or when you
> pull from the remote and when you pull into the branch. It should
> store something that tracks both the branch and remote branch together
> so that you can generalize it to multiple remotes.

I don't see why it has to support multiple remotes (but then I don't
have much experience with workflows involving multiple remotes, so I may
well be missing something). A local branch can only have one remote
tracking branch on one remote, and in my view --force-with-lease without
arguments works with that remote tracking branch only. Is this view too
simple?

> It doesn't necessarily track perfectly with a branch that contains
> extra work such as when doing pull --rebase, but maybe you have an
> idea about that?

Maybe I wasn't clear enough about that in my proposal, but I propose to
always store the commit hash of the remote tracking branch as a new
lease after push and pull, not the local branch. This way it works
nicely with pull --rebase and a branch that has extra local commits.
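
As a purely hypothetical sketch of what such a stored lease could look
like in the config (the `branch.<name>.lease` key is only a proposal in
this thread, not an existing git feature; the branch name and
placeholder value are invented):

```
[branch "topic"]
	remote = origin
	merge = refs/heads/topic
	# updated automatically after every push/pull, i.e. whenever the
	# local branch and origin/topic are known to agree:
	lease = <object name of origin/topic at that moment>
```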


-- 
Stefan Haller
Ableton
http://www.ableton.com/


Re: [PATCH] Make git log work for git CWD outside of work tree

2017-04-09 Thread Johannes Schindelin
Hi Danny,

On Sat, 8 Apr 2017, Danny Sauer wrote:

> Make git log's `--use-mailmap` argument work if the GIT_DIR &
> GIT_WORK_TREE env vars are set and git is run from outside of the work
> tree. Without NEED_WORK_TREE set on the log subcommand, .mailmap is
> silently not found.

A laudable goal. How about adding a test case, say, to t/t4203-mailmap.sh,
to ensure that Git won't regress on your fix?

> diff --git a/git.c b/git.c
> index 8ff44f0..e147f01 100644
> --- a/git.c
> +++ b/git.c
> @@ -440,7 +440,7 @@ static struct cmd_struct commands[] = {
>   { "init", cmd_init_db },
>   { "init-db", cmd_init_db },
>   { "interpret-trailers", cmd_interpret_trailers, RUN_SETUP_GENTLY },
> - { "log", cmd_log, RUN_SETUP },
> + { "log", cmd_log, RUN_SETUP | NEED_WORK_TREE },
>   { "ls-files", cmd_ls_files, RUN_SETUP | SUPPORT_SUPER_PREFIX },
>   { "ls-remote", cmd_ls_remote, RUN_SETUP_GENTLY },
>   { "ls-tree", cmd_ls_tree, RUN_SETUP },

This may work for you, but it does not work for me, as I often call `git
log` in a bare repository. And that call works, and it should keep
working.

Instead, I think, you need to figure out why the .mailmap file is not read
correctly when you use the GIT_DIR & GIT_WORK_TREE approach. My vague
hunch is that you need to replace the ".mailmap" in read_mailmap()
(defined in mailmap.c):

err |= read_mailmap_file(map, ".mailmap", repo_abbrev);

by something like mkpath("%s/%s", get_git_work_tree(), ".mailmap"), but
probably only after testing that we're not in a bare repository (which
would also fix a bug, I suspect, as `git log` in a bare repository
probably heeds a .mailmap file in the current directory, which is
incorrect). I.e. something like:

	if (!is_bare_repository()) {
		const char *path = mkpath("%s/%s",
					  get_git_work_tree(), ".mailmap");
		err |= read_mailmap_file(map, path, repo_abbrev);
	}

But you really want to add the test case first, with
`test_expect_failure`, to demonstrate what is currently broken, and then
triumphantly set it to `test_expect_success` with your patch.
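
For reference, a self-contained sketch of the setup being discussed
(assumes git is installed; all paths and names are invented), driving
`git log --use-mailmap` from outside the work tree purely via the
environment, which is exactly the situation the test case should pin
down:

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git config user.name "A U Thor"
git config user.email author@example.com
printf 'Real Name <author@example.com>\n' >.mailmap
echo content >file
git add .mailmap file
git commit -qm initial
# Run git log from *outside* the work tree, pointing at it only through
# the environment; whether %aN comes out mapped ("Real Name") or
# unmapped ("A U Thor") is what the test case needs to assert.
cd /
GIT_DIR="$tmp/repo/.git" GIT_WORK_TREE="$tmp/repo" \
	git log --use-mailmap -1 --format='%aN'
```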

Ciao,
Johannes


Re: [PATCH] push: document & test --force-with-lease with multiple remotes

2017-04-09 Thread Simon Ruderich
Hello,

I like the documentation update and test.

On Sat, Apr 08, 2017 at 11:41:00AM +, Ævar Arnfjörð Bjarmason wrote:
> [snip]
>
> ++
> +Now when the background process runs `git fetch origin` the references
> +on `origin-push` won't be updated, and thus commands like:
> ++
> + git push --force-with-lease origin

I think this should be origin-push.

Regards
Simon
-- 
+ privacy is necessary
+ using gnupg http://gnupg.org
+ public key id: 0x92FEFDB7E44C32F9



Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-09 Thread Stefan Haller
Jeff King  wrote:

> > It might be possible to generate these lease tags prior to operations
> > which modify history and then maybe having a way to list them so you
> > can select which one you meant when you try to use force-with-lease..
> 
> So yeah, I think that is the more interesting direction. I hadn't
> considered resolving the multiple-operation ambiguity at push time. But
> I guess it would be something like "you did a rebase on sha1 X at time
> T, and then one on Y at time T+N", and you pick which one you're
> expecting.

I think it's wrong to think about these leases as something that you
take before you start a rewindy operation. That's the wrong time to take
the lease; by that time, the remote tracking branch may already contain
new things that you haven't seen yet, so using that as a lease at that
time will overwrite those things later. You have to take the lease at a
time where you know that your local branch and the remote tracking
branch are up to date with each other, which is after pull and push. And
if you do that, there's no multiple-operation ambiguity to deal with at
all.

> And I think that may be converging on the "integrate" refs that Stefan is
> talking about elsewhere (or some isomorphism of it).

Does it make things clearer if we don't use the term "integrate", but
call the config value in my proposal simply "branch.*.lease"?


-- 
Stefan Haller
Ableton
http://www.ableton.com/


Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-09 Thread Jacob Keller
On Sun, Apr 9, 2017 at 1:38 AM, Stefan Haller  wrote:
> Jeff King  wrote:
>
>> > It might be possible to generate these lease tags prior to operations
>> > which modify history and then maybe having a way to list them so you
>> > can select which one you meant when you try to use force-with-lease..
>>
>> So yeah, I think that is the more interesting direction. I hadn't
>> considered resolving the multiple-operation ambiguity at push time. But
>> I guess it would be something like "you did a rebase on sha1 X at time
>> T, and then one on Y at time T+N", and you pick which one you're
>> expecting.
>
> I think it's wrong to think about these leases as something that you
> take before you start a rewindy operation. That's the wrong time to take
> the lease; by that time, the remote tracking branch may already contain
> new things that you haven't seen yet, so using that as a lease at that
> time will overwrite those things later. You have to take the lease at a
> time where you know that your local branch and the remote tracking
> branch are up to date with each other, which is after pull and push. And
> if you do that, there's no multiple-operation ambiguity to deal with at
> all.

Agreed. You "take" a lease whenever you push to the remote, or when
you pull from the remote into the branch. It should store something
that tracks both the branch and the remote branch together, so that
you can generalize it to multiple remotes.

It doesn't necessarily track perfectly with a branch that contains
extra work such as when doing pull --rebase, but maybe you have an
idea about that?

Thanks,
Jake

>
>> And I think that may be converging on the "integrate" refs that Stefan is
>> talking about elsewhere (or some isomorphism of it).
>
> Does it make things clearer if we don't use the term "integrate", but
> call the config value in my proposal simply "branch.*.lease"?
>

Yes, I think so.

>
> --
> Stefan Haller
> Ableton
> http://www.ableton.com/


Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-09 Thread Jacob Keller
On Sun, Apr 9, 2017 at 1:38 AM, Stefan Haller  wrote:
> Jacob Keller  wrote:
>
>> What if we added a separate command something like:
>>
>> git create-lease
>>
>> which you're expected to run at the start of a rewind-y operation and
>> it creates a tag (or some other ref like a tag but in a different
>> namespace) which is used by force-with-lease?
>
> The problem with this is that it doesn't help to use "git create-lease"
> right before you start your rewind-y operation, because by that time you
> may already have fetched. You'd have to use "git create-lease" right
> after you pull or push. But at the time I pull I don't know yet whether
> I will later want to rewrite the branch, so to be sure I have to do this
> every time I pull or push, and then I'd prefer git to do it for me.
>

No, you don't set the sha1 as the tip of "origin/master"; you set it as
the tip of "master" after you've performed all the integration and are
about to rewind history somehow.

>> However, I think using origin/master works fine as long as you don't 
>> auto-fetch.
>>
>> If you're doing it right, you can handle origin/master updates by
>> checking that your rewind-y stuff is correct for the new origin/master
>> RIGHT before you push.
>
> I'm not sure I understand what you mean by "checking that your rewind-y
> stuff is correct for the new origin/master"; does that mean manually
> inspecting origin/master to convince yourself that you are not
> overwriting something new? If so, I don't think this is acceptable. It
> is probably ok to work this way if the other party only pushed commits
> on top; it's reasonable to expect that you will recognize new commits as
> ones that you haven't seen before. But what if the other party has
> rewritten the branch and squashed improvements into commits in the
> middle of it? The head commit will then look the same as before, and the
> only way to tell whether you are overwriting something new is by
> comparing the old and new hashes. So then we're back at having to
> remember what the old hash was.
>

You can do a diff rather than a check of the log.
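
A minimal sketch of that diff-based check (assumes git is installed;
the other party's rewrite is simulated locally with --amend so that the
two tips differ while the content does not):

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.name A && git config user.email a@example.com
echo one >f && git add f && git commit -qm one
old=$(git rev-parse HEAD)
# Simulate a rewrite that changes the hash but not the content.
git commit -q --amend -m "one, reworded"
new=$(git rev-parse HEAD)
# Comparing hashes alone only says that *something* moved...
test "$old" != "$new" && echo "tips differ"
# ...while an empty content diff shows there is nothing new to review.
git diff --quiet "$old" "$new" && echo "no content change"
```

Here the script prints "tips differ" followed by "no content change":
the branch was rewritten, but the diff confirms nothing would be lost.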

Thanks,
Jake

>
> --
> Stefan Haller
> Berlin, Germany
> http://www.haller-berlin.de/


Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-09 Thread Stefan Haller
Jacob Keller  wrote:

> What if we added a separate command something like:
> 
> git create-lease
> 
> which you're expected to run at the start of a rewind-y operation and
> it creates a tag (or some other ref like a tag but in a different
> namespace) which is used by force-with-lease?

The problem with this is that it doesn't help to use "git create-lease"
right before you start your rewind-y operation, because by that time you
may already have fetched. You'd have to use "git create-lease" right
after you pull or push. But at the time I pull I don't know yet whether
I will later want to rewrite the branch, so to be sure I have to do this
every time I pull or push, and then I'd prefer git to do it for me.

> However, I think using origin/master works fine as long as you don't 
> auto-fetch.
>
> If you're doing it right, you can handle origin/master updates by
> checking that your rewind-y stuff is correct for the new origin/master
> RIGHT before you push.

I'm not sure I understand what you mean by "checking that your rewind-y
stuff is correct for the new origin/master"; does that mean manually
inspecting origin/master to convince yourself that you are not
overwriting something new? If so, I don't think this is acceptable. It
is probably ok to work this way if the other party only pushed commits
on top; it's reasonable to expect that you will recognize new commits as
ones that you haven't seen before. But what if the other party has
rewritten the branch and squashed improvements into commits in the
middle of it? The head commit will then look the same as before, and the
only way to tell whether you are overwriting something new is by
comparing the old and new hashes. So then we're back at having to
remember what the old hash was.


-- 
Stefan Haller
Berlin, Germany
http://www.haller-berlin.de/


Re: why patch to the gitk no replay?

2017-04-09 Thread Florian Schüller
My patch submission "[PATCH 0/4] Gitk Inotify support" (already a
retry) has also been pending since the 8th of March. I'm also not sure
what I did wrong, as I'm not used to mailing-list-based development,
or whether Paul is just very busy.

On Fri, Mar 17, 2017 at 5:53 PM, Jonathan Nieder  wrote:
> Hi,
>
> yanke131415 wrote:
>
>> I sent a patch to the gitk
>> project (git://ozlabs.org/~paulus/gitk) with the target email address
>> pau...@ozlabs.org. But several days later I have received no reply to
>> this patch. Is the email address I sent the patch to wrong? Does anyone know?
>
> Sending to this mailing list (and cc-ing Paul) is more likely to work.
>
> Hope that helps,
> Jonathan