Re: [PATCHv4 06/14] run-command: add an asynchronous parallel child processor

2015-09-22 Thread Junio C Hamano
Stefan Beller  writes:

> +static void pp_buffer_stderr(struct parallel_processes *pp)
> +{
> + int i;
> +
> + while ((i = poll(pp->pfd, pp->max_processes, 100)) < 0) {
> + if (errno == EINTR)
> + continue;
> + pp_cleanup(pp);
> + die_errno("poll");
> + }
> +
> + /* Buffer output from all pipes. */
> + for (i = 0; i < pp->max_processes; i++) {
> + if (pp->children[i].in_use &&
> + pp->pfd[i].revents & POLLIN)
> + if (strbuf_read_once(&pp->children[i].err,
> +  pp->children[i].process.err, 0) < 0)
> + if (errno != EAGAIN)
> + die_errno("read");
> + }
> +}

I think it is a good thing that the caller is passing the whole pp
to this function.  One thing you may want to consider is using a
longer poll(2) timeout when the process slots are full.

There is nothing you can gain by returning early due to timeout
without doing anything from this function when you know you cannot
start a new process (here, I am assuming that your poll(2) would be
unblocked for a disconnect when one of the processes exits, letting
you return and letting the caller call collect_finished(), which in
turn would allow us to make some progress).

On the other hand, during the early ramp-up period, you may want to
use a shorter poll(2) timeout to give the caller a chance to spawn
more processes sooner.  But that falls into performance tuning that
can and should be left to a later follow-up patch after we get the
basic machinery right.
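The tuning suggested above could be captured in a tiny helper along these lines (a hypothetical sketch, not part of the posted patch; the function name and the concrete millisecond values are made up):

```c
#include <assert.h>

/*
 * Hypothetical helper illustrating the suggested tuning: when every
 * process slot is occupied, nothing useful can happen until a child
 * exits (and poll(2) is unblocked by the pipe disconnect anyway), so a
 * long timeout is harmless; during ramp-up a short timeout lets the
 * caller spawn more children sooner.
 */
static int pick_poll_timeout_ms(int nr_processes, int max_processes)
{
	if (nr_processes >= max_processes)
		return 1000;	/* slots full: wait patiently */
	return 50;		/* ramp-up: return quickly to spawn more */
}
```

The caller would pass the result of this helper as the third argument of poll(2) instead of a hard-coded 100.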
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv4 06/14] run-command: add an asynchronous parallel child processor

2015-09-22 Thread Junio C Hamano
Stefan Beller  writes:

>  run-command.c  | 264 +
>  run-command.h  |  36 +++
>  t/t0061-run-command.sh |  20 
>  test-run-command.c |  24 +
>  4 files changed, 344 insertions(+)

I think we are almost there, but there were still a few more "huh?"
in this patch.

> +/* returns 1 if a process was started, 0 otherwise */

This claim is dubious.

The last four lines of this function marks the i-th slot as in_use,
and increments nr_processes, so that exit codepath clearly is about
"a process was started" and should be returning 1, but the code
returns 0, which looks incorrect.

> +static int pp_start_one(struct parallel_processes *pp)
> +{
> + int i;
> +
> + for (i = 0; i < pp->max_processes; i++)
> + if (!pp->children[i].in_use)
> + break;
> + if (i == pp->max_processes)
> + die("BUG: bookkeeping is hard");
> +
> + if (!pp->get_next_task(pp->data,
> +&pp->children[i].process,
> +&pp->children[i].err))
> + return 1;

And this one, when get_next_task() says "nothing more to do", is
clearly "we returned without starting anything", so according to the
comment it should be returning 0, but the code returns 1, which
looks incorrect.

> + if (start_command(&pp->children[i].process))
> + pp->start_failure(pp->data,
> +   &pp->children[i].process,
> +   &pp->children[i].err);

What should happen if start_failure returns without dying?
Shouldn't this function return something, without doing the
remainder of it?  i.e.

if (start_command(...)) {
pp->start_failure(...);
return SOMETHING;
}

According to the comment at the beginning, that SOMETHING ought to
be 0, because we did not spawn anything, i.e. we did not increment
nr_processes.

But don't make that change yet; I do not think it is a great
interface to say "did we or did we not add a new process?" anyway
(see below).

> + set_nonblocking(pp->children[i].process.err);
> +
> + pp->nr_processes++;
> + pp->children[i].in_use = 1;
> + pp->pfd[i].fd = pp->children[i].process.err;
> + return 0;

And this is "we spawned" that ought to have returned 1.

Perhaps the comment at the beginning is wrong, as the code
consistently does the opposite of what the comment says.

But it does not really matter, as I do not think this should be
returning "did we or did we not start a new process?" (see below).

> +}

> +int run_processes_parallel(int n, void *data,
> +get_next_task_fn get_next_task,
> +start_failure_fn start_failure,
> +return_value_fn return_value)
> +{
> + struct parallel_processes pp;
> + pp_init(&pp, n, data, get_next_task, start_failure, return_value);
> +
> + while (1) {
> + while (pp.nr_processes < pp.max_processes &&
> +!pp_start_one(&pp))
> + ; /* nothing */
> + if (!pp.nr_processes)
> + break;

This inner loop is why I think "did we or did we not spawn a new
process?" is not a great interface.

The reason why it is not a great interface is because there are two
possible reasons why pp_start_one() does not spawn a new process,
and this caller wants to behave differently depending on why it did
not spawn a new process.  They are:

 * get_next_task() truly ran out of things to do.

 * get_next_task() gave us a task, but it did not start, and
   start_failure was set not to die (e.g. the function can be used
   to tell the next_task machinery that it needs to return a
   replacement task for the one that failed to run.  That way, upon
   next call to get_next_task, a replacement task can run instead of
   the old one that failed).

For the former, we want to stop looping, for the latter, we
definitely do want to keep looping, as we want to make another call
to get_next_task() to grab the replacement task for the one that
just failed.

So I think it makes more sense to define the meaning of the return
value from pp_start_one() differently from the way this patch
defines.  "Return 0 when we truly ran out of things to do, otherwise
return non-zero", for example, would make more sense.  The return
value does not tell you if the call resulted in one more process,
but that is not any loss, as you can look at pp.nr_processes
yourself if you really cared.

With that, the above caller could be updated, with optional gradual
ramp_up, like so:

#define RAMP_UP_LIMIT 2

while (1) {
int ramp_up;
int no_more_task;

for (no_more_task = 0, ramp_up = RAMP_UP_LIMIT;
     !no_more_task && ramp_up && pp.nr_processes < pp.max_processes;
     ramp_up--)
if (!p
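Under the proposed convention, where pp_start_one() returns 0 only when there are truly no more tasks, the ramp-up loop can be modelled in a self-contained form. The struct, the stub task source, and all names below are hypothetical stand-ins for the real machinery:

```c
#include <assert.h>

/* Hypothetical stand-ins for the real run-command machinery. */
struct parallel_processes_sketch {
	int nr_processes;
	int max_processes;
};

static int tasks_left;	/* stub task source for demonstration */

/* Proposed convention: return 0 only when truly out of things to do. */
static int pp_start_one_sketch(struct parallel_processes_sketch *pp)
{
	if (!tasks_left)
		return 0;
	tasks_left--;		/* pretend the task started fine */
	pp->nr_processes++;
	return 1;
}

#define RAMP_UP_LIMIT 2

/*
 * One scheduling round: start at most RAMP_UP_LIMIT new children.
 * Returns 1 when the task source is exhausted, 0 otherwise.
 */
static int spawn_round(struct parallel_processes_sketch *pp)
{
	int no_more_task, ramp_up;

	for (no_more_task = 0, ramp_up = RAMP_UP_LIMIT;
	     !no_more_task && ramp_up && pp->nr_processes < pp->max_processes;
	     ramp_up--)
		if (!pp_start_one_sketch(pp))
			no_more_task = 1;
	return no_more_task;
}
```

A task that fails to start (with a non-dying start_failure) would simply not increment nr_processes but still return non-zero, so the caller keeps looping and can pick up a replacement task.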

Re: [PATCHv3 02/13] xread: poll on non blocking fds

2015-09-22 Thread Jeff King
On Tue, Sep 22, 2015 at 05:14:42PM -0700, Stefan Beller wrote:

> We should not care if the call to poll failed, as we're in an infinite loop and
> can only get out with the correct read(..). So maybe an implementation like this
> would already suffice:
> 
> ssize_t xread(int fd, void *buf, size_t len)
> {
> ssize_t nr;
> if (len > MAX_IO_SIZE)
> len = MAX_IO_SIZE;
> while (1) {
> nr = read(fd, buf, len);
> if (nr < 0) {
> if (errno == EINTR)
> continue;
> if (errno == EAGAIN || errno == EWOULDBLOCK) {
> struct pollfd pfd;
> pfd.events = POLLIN;
> pfd.fd = fd;
> /* We deliberately ignore the return value of poll. */
> poll(&pfd, 1, -1);
> continue;
> }
> }
> return nr;
> }
> }

FWIW, that is what I had imagined.
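For reference, the quoted loop compiles into a self-contained sketch that can be exercised against a nonblocking pipe (the MAX_IO_SIZE value and the function name here are assumptions; git defines its own):

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

#define MAX_IO_SIZE (8 * 1024 * 1024)

/*
 * Same shape as the xread() sketch above: retry on EINTR, and on
 * EAGAIN/EWOULDBLOCK wait politely in poll() instead of busy-looping.
 */
static ssize_t xread_sketch(int fd, void *buf, size_t len)
{
	ssize_t nr;
	if (len > MAX_IO_SIZE)
		len = MAX_IO_SIZE;
	while (1) {
		nr = read(fd, buf, len);
		if (nr < 0) {
			if (errno == EINTR)
				continue;
			if (errno == EAGAIN || errno == EWOULDBLOCK) {
				struct pollfd pfd;
				pfd.events = POLLIN;
				pfd.fd = fd;
				/* return value deliberately ignored */
				poll(&pfd, 1, -1);
				continue;
			}
		}
		return nr;
	}
}
```

Note that writing to the pipe before reading keeps poll() from blocking in a toy exercise; with an empty pipe and no writer, the infinite timeout would make the call wait indefinitely.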

-Peff


[PATCHv4 07/14] fetch_populated_submodules: use new parallel job processing

2015-09-22 Thread Stefan Beller
A later patch will enable parallel processing of submodules; this one
only adds the machinery for it, so it should not change any
user-facing behavior.

Signed-off-by: Stefan Beller 
---
 submodule.c | 119 +++-
 1 file changed, 85 insertions(+), 34 deletions(-)

diff --git a/submodule.c b/submodule.c
index 1d64e57..d7c7a6e 100644
--- a/submodule.c
+++ b/submodule.c
@@ -12,6 +12,7 @@
 #include "sha1-array.h"
 #include "argv-array.h"
 #include "blob.h"
+#include "thread-utils.h"
 
 static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
 static struct string_list changed_submodule_paths;
@@ -615,37 +616,82 @@ static void calculate_changed_submodule_paths(void)
initialized_fetch_ref_tips = 0;
 }
 
+struct submodule_parallel_fetch {
+   int count;
+   struct argv_array args;
+   const char *work_tree;
+   const char *prefix;
+   int command_line_option;
+   int quiet;
+   int result;
+};
+#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0}
+
+int get_next_submodule(void *data, struct child_process *cp,
+  struct strbuf *err);
+
+void handle_submodule_fetch_start_err(void *data, struct child_process *cp, struct strbuf *err)
+{
+   struct submodule_parallel_fetch *spf = data;
+   spf->result = 1;
+}
+
+void handle_submodule_fetch_finish( void *data, struct child_process *cp, int retvalue)
+{
+   struct submodule_parallel_fetch *spf = data;
+
+   if (retvalue)
+   spf->result = 1;
+}
+
 int fetch_populated_submodules(const struct argv_array *options,
   const char *prefix, int command_line_option,
   int quiet)
 {
-   int i, result = 0;
-   struct child_process cp = CHILD_PROCESS_INIT;
-   struct argv_array argv = ARGV_ARRAY_INIT;
-   const char *work_tree = get_git_work_tree();
-   if (!work_tree)
+   int i;
+   int max_parallel_jobs = 1;
+   struct submodule_parallel_fetch spf = SPF_INIT;
+
+   spf.work_tree = get_git_work_tree();
+   spf.command_line_option = command_line_option;
+   spf.quiet = quiet;
+   spf.prefix = prefix;
+
+   if (!spf.work_tree)
goto out;
 
if (read_cache() < 0)
die("index file corrupt");
 
-   argv_array_push(&argv, "fetch");
+   argv_array_push(&spf.args, "fetch");
for (i = 0; i < options->argc; i++)
-   argv_array_push(&argv, options->argv[i]);
-   argv_array_push(&argv, "--recurse-submodules-default");
+   argv_array_push(&spf.args, options->argv[i]);
+   argv_array_push(&spf.args, "--recurse-submodules-default");
/* default value, "--submodule-prefix" and its value are added later */
 
-   cp.env = local_repo_env;
-   cp.git_cmd = 1;
-   cp.no_stdin = 1;
-
calculate_changed_submodule_paths();
+   run_processes_parallel(max_parallel_jobs, &spf,
+  get_next_submodule,
+  handle_submodule_fetch_start_err,
+  handle_submodule_fetch_finish);
+
+   argv_array_clear(&spf.args);
+out:
+   string_list_clear(&changed_submodule_paths, 1);
+   return spf.result;
+}
+
+int get_next_submodule(void *data, struct child_process *cp,
+  struct strbuf *err)
+{
+   int ret = 0;
+   struct submodule_parallel_fetch *spf = data;
 
-   for (i = 0; i < active_nr; i++) {
+   for ( ; spf->count < active_nr; spf->count++) {
struct strbuf submodule_path = STRBUF_INIT;
struct strbuf submodule_git_dir = STRBUF_INIT;
struct strbuf submodule_prefix = STRBUF_INIT;
-   const struct cache_entry *ce = active_cache[i];
+   const struct cache_entry *ce = active_cache[spf->count];
const char *git_dir, *default_argv;
const struct submodule *submodule;
 
@@ -657,7 +703,7 @@ int fetch_populated_submodules(const struct argv_array *options,
submodule = submodule_from_name(null_sha1, ce->name);
 
default_argv = "yes";
-   if (command_line_option == RECURSE_SUBMODULES_DEFAULT) {
+   if (spf->command_line_option == RECURSE_SUBMODULES_DEFAULT) {
if (submodule &&
submodule->fetch_recurse !=
RECURSE_SUBMODULES_NONE) {
@@ -680,40 +726,45 @@ int fetch_populated_submodules(const struct argv_array *options,
default_argv = "on-demand";
}
}
-   } else if (command_line_option == RECURSE_SUBMODULES_ON_DEMAND) {
+   } else if (spf->command_line_option == RECURSE_SUBMODULES_ON_DEMAND) {
if 
(!unsorted_string

[PATCHv4 13/14] git submodule update: cmd_update_fetch

2015-09-22 Thread Stefan Beller
Split the fetching part out into its own function; this allows us,
in a later patch, to convert cmd_update to C.

Signed-off-by: Stefan Beller 
---
 git-submodule.sh | 164 ---
 1 file changed, 84 insertions(+), 80 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 7f11158..a1bc8d5 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -625,6 +625,89 @@ cmd_update_clone()
cmd_update_recursive
 }
 
+cmd_update_fetch()
+{
+   subsha1=$(clear_local_git_env; cd "$sm_path" &&
+   git rev-parse --verify HEAD) ||
+   die "$(eval_gettext "Unable to find current revision in submodule path '\$displaypath'")"
+
+   if test -n "$remote"
+   then
+   if test -z "$nofetch"
+   then
+   # Fetch remote before determining tracking $sha1
+   (clear_local_git_env; cd "$sm_path" && git-fetch) ||
+   die "$(eval_gettext "Unable to fetch in submodule path '\$sm_path'")"
+   fi
+   remote_name=$(clear_local_git_env; cd "$sm_path" && get_default_remote)
+   sha1=$(clear_local_git_env; cd "$sm_path" &&
+   git rev-parse --verify "${remote_name}/${branch}") ||
+   die "$(eval_gettext "Unable to find current ${remote_name}/${branch} revision in submodule path '\$sm_path'")"
+   fi
+
+   if test "$subsha1" != "$sha1" || test -n "$force"
+   then
+   subforce=$force
+   # If we don't already have a -f flag and the submodule has never been checked out
+   if test -z "$subsha1" && test -z "$force"
+   then
+   subforce="-f"
+   fi
+
+   if test -z "$nofetch"
+   then
+   # Run fetch only if $sha1 isn't present or it
+   # is not reachable from a ref.
+   (clear_local_git_env; cd "$sm_path" &&
+   ( (rev=$(git rev-list -n 1 $sha1 --not --all 2>/dev/null) &&
+test -z "$rev") || git-fetch)) ||
+   die "$(eval_gettext "Unable to fetch in submodule path '\$displaypath'")"
+   fi
+
+   must_die_on_failure=
+   case "$update_module" in
+   checkout)
+   command="git checkout $subforce -q"
+   die_msg="$(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")"
+   say_msg="$(eval_gettext "Submodule path '\$displaypath': checked out '\$sha1'")"
+   ;;
+   rebase)
+   command="git rebase"
+   die_msg="$(eval_gettext "Unable to rebase '\$sha1' in submodule path '\$displaypath'")"
+   say_msg="$(eval_gettext "Submodule path '\$displaypath': rebased into '\$sha1'")"
+   must_die_on_failure=yes
+   ;;
+   merge)
+   command="git merge"
+   die_msg="$(eval_gettext "Unable to merge '\$sha1' in submodule path '\$displaypath'")"
+   say_msg="$(eval_gettext "Submodule path '\$displaypath': merged in '\$sha1'")"
+   must_die_on_failure=yes
+   ;;
+   !*)
+   command="${update_module#!}"
+   die_msg="$(eval_gettext "Execution of '\$command \$sha1' failed in submodule  path '\$prefix\$sm_path'")"
+   say_msg="$(eval_gettext "Submodule path '\$prefix\$sm_path': '\$command \$sha1'")"
+   must_die_on_failure=yes
+   ;;
+   *)
+   die "$(eval_gettext "Invalid update mode '$update_module' for submodule '$name'")"
+   esac
+
+   if (clear_local_git_env; cd "$sm_path" && $command "$sha1")
+   then
+   say "$say_msg"
+   elif test -n "$must_die_on_failure"
+   then
+   die_with_status 2 "$die_msg"
+   else
+   err="${err};$die_msg"
+   return
+   fi
+   fi
+
+   cmd_update_recursive
+}
+
 #
# Update each submodule path to correct revision, using clone and checkout as needed
 #
@@ -743,88 +826,9 @@ Maybe you want to use 'update --init'?")"
if ! test -d "$sm_path"/.git && ! test -f "$sm_path"/.git
then
cmd_update_clone
-   continue
else
-   subsha1=$(clear_local_git_env; cd "$sm_path" &&
-   git rev-parse --verify HEAD) ||
-   die "$(eval_gettext "Unable to find current revision in submodule path '\$displaypath'")"
+

[PATCHv4 05/14] run-command: factor out return value computation

2015-09-22 Thread Stefan Beller
A later patch will need to compute the return value without
waiting.

Signed-off-by: Stefan Beller 
---
 run-command.c | 56 ++--
 1 file changed, 34 insertions(+), 22 deletions(-)

diff --git a/run-command.c b/run-command.c
index 28e1d55..2bf1aba 100644
--- a/run-command.c
+++ b/run-command.c
@@ -232,6 +232,35 @@ static inline void set_cloexec(int fd)
fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
 }
 
+static int determine_return_value(int wait_status,
+ int *result,
+ int *error_code,
+ const char *argv0)
+{
+   if (WIFSIGNALED(wait_status)) {
+   *result = WTERMSIG(wait_status);
+   if (*result != SIGINT && *result != SIGQUIT)
+   error("%s died of signal %d", argv0, *result);
+   /*
+* This return value is chosen so that code & 0xff
+* mimics the exit code that a POSIX shell would report for
+* a program that died from this signal.
+*/
+   *result += 128;
+   } else if (WIFEXITED(wait_status)) {
+   *result = WEXITSTATUS(wait_status);
+   /*
+* Convert special exit code when execvp failed.
+*/
+   if (*result == 127) {
+   *result = -1;
+   *error_code = ENOENT;
+   }
+   } else
+   return -1;
+   return 0;
+}
+
 static int wait_or_whine(pid_t pid, const char *argv0)
 {
int status, code = -1;
@@ -244,29 +273,12 @@ static int wait_or_whine(pid_t pid, const char *argv0)
if (waiting < 0) {
failed_errno = errno;
error("waitpid for %s failed: %s", argv0, strerror(errno));
-   } else if (waiting != pid) {
-   error("waitpid is confused (%s)", argv0);
-   } else if (WIFSIGNALED(status)) {
-   code = WTERMSIG(status);
-   if (code != SIGINT && code != SIGQUIT)
-   error("%s died of signal %d", argv0, code);
-   /*
-* This return value is chosen so that code & 0xff
-* mimics the exit code that a POSIX shell would report for
-* a program that died from this signal.
-*/
-   code += 128;
-   } else if (WIFEXITED(status)) {
-   code = WEXITSTATUS(status);
-   /*
-* Convert special exit code when execvp failed.
-*/
-   if (code == 127) {
-   code = -1;
-   failed_errno = ENOENT;
-   }
} else {
-   error("waitpid is confused (%s)", argv0);
+   if (waiting != pid || (determine_return_value(status,
+ &code,
+ &failed_errno,
+ argv0) < 0))
+   error("waitpid is confused (%s)", argv0);
}
 
clear_child_for_cleanup(pid);
-- 
2.5.0.272.ga84127c.dirty
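The mapping this factored-out helper implements (death by signal becomes 128 plus the signal number, and the special exit code 127 from a failed execvp is turned into -1 with ENOENT) can be demonstrated with a simplified, self-contained version; the function name here is illustrative, not the one in the patch:

```c
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Simplified sketch of the factored-out helper: translate a waitpid()
 * status into the result/errno pair the callers expect.  Returns -1
 * when the status is neither a signal death nor a normal exit.
 */
static int determine_return_value_demo(int wait_status, int *result,
				       int *error_code)
{
	if (WIFSIGNALED(wait_status)) {
		/* mimic the exit code a POSIX shell would report */
		*result = WTERMSIG(wait_status) + 128;
	} else if (WIFEXITED(wait_status)) {
		*result = WEXITSTATUS(wait_status);
		if (*result == 127) {	/* execvp failed */
			*result = -1;
			*error_code = ENOENT;
		}
	} else
		return -1;
	return 0;
}
```
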



[PATCHv4 10/14] submodule config: keep update strategy around

2015-09-22 Thread Stefan Beller
We need the submodule update strategies in a later patch.

Signed-off-by: Stefan Beller 
---
 submodule-config.c | 11 +++
 submodule-config.h |  1 +
 2 files changed, 12 insertions(+)

diff --git a/submodule-config.c b/submodule-config.c
index afe0ea8..8b8c7d1 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -194,6 +194,7 @@ static struct submodule *lookup_or_create_by_name(struct submodule_cache *cache,
 
submodule->path = NULL;
submodule->url = NULL;
+   submodule->update = NULL;
submodule->fetch_recurse = RECURSE_SUBMODULES_NONE;
submodule->ignore = NULL;
 
@@ -311,6 +312,16 @@ static int parse_config(const char *var, const char *value, void *data)
free((void *) submodule->url);
submodule->url = xstrdup(value);
}
+   } else if (!strcmp(item.buf, "update")) {
+   if (!value)
+   ret = config_error_nonbool(var);
+   else if (!me->overwrite && submodule->update != NULL)
+   warn_multiple_config(me->commit_sha1, submodule->name,
+"update");
+   else {
+   free((void *)submodule->update);
+   submodule->update = xstrdup(value);
+   }
}
 
strbuf_release(&name);
diff --git a/submodule-config.h b/submodule-config.h
index 9061e4e..f9e2a29 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -14,6 +14,7 @@ struct submodule {
const char *url;
int fetch_recurse;
const char *ignore;
+   const char *update;
/* the sha1 blob id of the responsible .gitmodules file */
unsigned char gitmodules_sha1[20];
 };
-- 
2.5.0.272.ga84127c.dirty



[PATCHv4 11/14] git submodule update: cmd_update_recursive

2015-09-22 Thread Stefan Beller
Split the recursion part out into its own function; this allows us,
in a later patch, to convert cmd_update to C.

Signed-off-by: Stefan Beller 
---
 git-submodule.sh | 47 ++-
 1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 8b0eb9a..ea3260e 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -582,6 +582,31 @@ cmd_deinit()
done
 }
 
+
+cmd_update_recursive()
+{
+   if test -n "$recursive"
+   then
+   (
+   prefix="$prefix$sm_path/"
+   clear_local_git_env
+   cd "$sm_path" &&
+   eval cmd_update
+   )
+   res=$?
+   if test $res -gt 0
+   then
+   die_msg="$(eval_gettext "Failed to recurse into submodule path '\$displaypath'")"
+   if test $res -eq 1
+   then
+   err="${err};$die_msg"
+   else
+   die_with_status $res "$die_msg"
+   fi
+   fi
+   fi
+}
+
 #
# Update each submodule path to correct revision, using clone and checkout as needed
 #
@@ -790,27 +815,7 @@ Maybe you want to use 'update --init'?")"
fi
fi
 
-   if test -n "$recursive"
-   then
-   (
-   prefix="$prefix$sm_path/"
-   clear_local_git_env
-   cd "$sm_path" &&
-   eval cmd_update
-   )
-   res=$?
-   if test $res -gt 0
-   then
-   die_msg="$(eval_gettext "Failed to recurse into submodule path '\$displaypath'")"
-   if test $res -eq 1
-   then
-   err="${err};$die_msg"
-   continue
-   else
-   die_with_status $res "$die_msg"
-   fi
-   fi
-   fi
+   cmd_update_recursive
done
 
if test -n "$err"
-- 
2.5.0.272.ga84127c.dirty



[PATCHv4 14/14] Rewrite submodule update in C

2015-09-22 Thread Stefan Beller
This will make parallelisation easier in a follow-up patch. It is just
a translation from shell to C, hopefully without introducing any bugs.

Signed-off-by: Stefan Beller 
---
 builtin/submodule--helper.c | 251 
 git-submodule.sh| 135 +---
 2 files changed, 254 insertions(+), 132 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index f4c3eff..b79117a 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -255,6 +255,256 @@ static int module_clone(int argc, const char **argv, const char *prefix)
return 0;
 }
 
+struct module_update_data {
+   struct module_list list;
+   int count;
+
+   struct pathspec pathspec;
+
+   int no_fetch;
+   int force;
+   int update;
+   int quiet;
+   int recursive;
+   int remote;
+   char *prefix;
+   char *reference;
+   char *depth;
+
+   struct argv_array args;
+
+   int result;
+};
+
+static void module_update_data_init(struct module_update_data *mud)
+{
+   mud->list.entries = NULL;
+   mud->list.alloc = 0;
+   mud->list.nr = 0;
+
+   memset(&mud->pathspec, 0, sizeof(mud->pathspec));
+
+   mud->count = 0;
+   mud->no_fetch = 0;
+   mud->force = 0;
+   mud->update = 0;
+   mud->quiet = 0;
+   mud->remote = 0;
+   mud->recursive = 0;
+   mud->result = 0;
+
+   mud->prefix = NULL;
+   mud->reference = NULL;
+   mud->depth = NULL;
+
+   argv_array_init(&mud->args);
+}
+
+static int update_next_task(void *data,
+struct child_process *cp,
+struct strbuf *err)
+{
+   int i;
+   struct module_update_data *mud = data;
+   struct strbuf sb = STRBUF_INIT;
+   const char *displaypath;
+
+   for (; mud->count < mud->list.nr; mud->count++) {
+   const char *update_module;
+   const char *sm_gitdir;
+   const struct submodule *sub;
+   const struct cache_entry *ce = mud->list.entries[mud->count];
+
+   displaypath = relative_path(ce->name, mud->prefix, &sb);
+   strbuf_reset(&sb);
+
+   if (ce_stage(ce)) {
+   strbuf_addf(err, "Skipping unmerged submodule %s",
+   displaypath);
+   continue;
+   }
+
+   sub = submodule_from_path(null_sha1, ce->name);
+   if (!sub) {
+   mud->result = 1;
+   return 0;
+   }
+
+   switch (mud->update) {
+   case 'r':
+   update_module = "rebase";
+   break;
+   case 'c':
+   update_module = "checkout";
+   break;
+   case 'm':
+   update_module = "merge";
+   break;
+   case 0:
+   /* not specified by command line */
+   if (sub->update)
+   update_module = sub->update;
+   else
+   update_module = "checkout";
+   break;
+   default:
+   die("BUG: update mode not covered");
+   }
+
+   if (!strcmp(update_module, "none")) {
+   strbuf_addf(err, "Skipping submodule '%s'", displaypath);
+   continue;
+   }
+
+   if (!sub->url) {
+   /*
+* Only mention uninitialized submodules when its
+* path have been specified
+*/
+   if (!mud->pathspec.nr)
+   continue;
+
+   strbuf_addf(err,
+   _("Submodule path '%s' not initialized \n"
+   "Maybe you want to use 'update --init'?"),
+   displaypath);
+   continue;
+   }
+
+   strbuf_addf(&sb, "%s/.git", ce->name);
+   sm_gitdir = strbuf_detach(&sb, NULL);
+
+   child_process_init(cp);
+   for (i = 0; local_repo_env[i]; i++)
+   argv_array_pushf(&cp->env_array, "%s", local_repo_env[i]);
+
+   argv_array_pushf(&cp->env_array, "displaypath=%s", displaypath);
+   argv_array_pushf(&cp->env_array, "sm_path=%s", sub->path);
+   argv_array_pushf(&cp->env_array, "name=%s", sub->name);
+   argv_array_pushf(&cp->env_array, "url=%s", sub->url);
+   argv_array_pushf(&cp->env_array, "sha1=%s", sha1_to_hex(ce->sha1));
+   argv_array_pushf(&cp->env_array, "update_module=%s", update_module);
+
+   cp->git_cmd = 1;
+ 

[PATCHv4 06/14] run-command: add an asynchronous parallel child processor

2015-09-22 Thread Stefan Beller
This allows us to run external commands in parallel with ordered output
on stderr.

If we run external commands in parallel, we cannot pipe their output
directly to our stdout/err, as it would get mixed up. So each process's
output flows through a pipe, which we buffer. One subprocess at a time
can be piped directly to our stdout/err for low-latency feedback to the
user.

Example:
Let's assume we have 5 submodules A,B,C,D,E and each fetch takes a
different amount of time as the different submodules vary in size, then
the output of fetches in sequential order might look like this:

 time -->
 output: |---A---| |-B-| |---C---| |-D-| |-E-|

When we schedule these submodules into maximal two parallel processes,
a schedule and sample output over time may look like this:

process 1: |---A---| |-D-| |-E-|

process 2: |-B-| |---C---|

output:|---A---|B|---C---|DE

So A will be perceived as if it ran normally in the single-child
version. As B has finished by the time A is done, we can dump its whole
progress buffer on stderr, such that it looks as if it finished in no
time. Once that is done, C is determined to be the visible child and
its progress will be reported in real time.

So this way of presenting output is really good for human consumption,
as it only changes the timing, not the actual output.

For machine consumption the output needs to be prepared by the tasks,
by having either a prefix per line or per block to indicate which
task's output is displayed, because the output order may not follow the
original sequential ordering:

 |A| |--B--| |-C-|

will be scheduled to be all parallel:

process 1: |A|
process 2: |--B--|
process 3: |-C-|
output:|A|CB

This happens because C finished before B did, so it will be queued for
output before B.
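The buffering policy described above can be modelled in miniature. This toy version (all names hypothetical, and the owner rotation of the real code omitted) shows why the second example prints A, then C, then B:

```c
#include <string.h>

/*
 * Toy model of the output policy: one child (the "owner") streams
 * directly to the terminal; output of other children that finish in
 * the meantime is queued in finish order and dumped when the owner
 * finishes.  (The real code then rotates the owner; omitted here.)
 */
struct out_state {
	int owner;          /* index of the child shown live */
	char queued[256];   /* finished non-owner children, in finish order */
	char shown[256];    /* what the user ends up seeing */
};

static void child_done(struct out_state *s, int i, const char *text)
{
	if (i != s->owner) {
		strcat(s->queued, text);   /* buffer it for later */
		return;
	}
	strcat(s->shown, text);            /* owner's output was live */
	strcat(s->shown, s->queued);       /* flush everything queued */
	s->queued[0] = '\0';
}
```
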

Signed-off-by: Stefan Beller 
---
 run-command.c  | 264 +
 run-command.h  |  36 +++
 t/t0061-run-command.sh |  20 
 test-run-command.c |  24 +
 4 files changed, 344 insertions(+)

diff --git a/run-command.c b/run-command.c
index 2bf1aba..494e1f8 100644
--- a/run-command.c
+++ b/run-command.c
@@ -3,6 +3,8 @@
 #include "exec_cmd.h"
 #include "sigchain.h"
 #include "argv-array.h"
+#include "thread-utils.h"
+#include "strbuf.h"
 
 void child_process_init(struct child_process *child)
 {
@@ -864,3 +866,265 @@ int capture_command(struct child_process *cmd, struct strbuf *buf, size_t hint)
close(cmd->out);
return finish_command(cmd);
 }
+
+struct parallel_processes {
+   void *data;
+
+   int max_processes;
+   int nr_processes;
+
+   get_next_task_fn get_next_task;
+   start_failure_fn start_failure;
+   return_value_fn return_value;
+
+   struct {
+   unsigned in_use : 1;
+   struct child_process process;
+   struct strbuf err;
+   } *children;
+   /*
+* The struct pollfd is logically part of *children,
+* but the system call expects it as its own array.
+*/
+   struct pollfd *pfd;
+
+   int output_owner;
+   struct strbuf buffered_output; /* of finished children */
+};
+
+void default_start_failure(void *data,
+  struct child_process *cp,
+  struct strbuf *err)
+{
+   int i;
+   struct strbuf sb = STRBUF_INIT;
+
+   for (i = 0; cp->argv[i]; i++)
+   strbuf_addf(&sb, " %s", cp->argv[i]);
+
+   die_errno("Starting a child failed:%s", sb.buf);
+}
+
+void default_return_value(void *data,
+ struct child_process *cp,
+ int result)
+{
+   int i;
+   struct strbuf sb = STRBUF_INIT;
+
+   if (!result)
+   return;
+
+   for (i = 0; cp->argv[i]; i++)
+   strbuf_addf(&sb, " %s", cp->argv[i]);
+
+   die_errno("A child failed with return code %d:%s", result, sb.buf);
+}
+
+static void pp_init(struct parallel_processes *pp,
+   int n, void *data,
+   get_next_task_fn get_next_task,
+   start_failure_fn start_failure,
+   return_value_fn return_value)
+{
+   int i;
+
+   if (n < 1)
+   n = online_cpus();
+
+   pp->max_processes = n;
+   pp->data = data;
+   if (!get_next_task)
+   die("BUG: you need to specify a get_next_task function");
+   pp->get_next_task = get_next_task;
+
+   pp->start_failure = start_failure ? start_failure : default_start_failure;
+   pp->return_value = return_value ? return_value : default_return_value;
+
+   pp->nr_processes = 0;
+   pp->output_owner = 0;
+   pp->children = xcalloc(n, sizeof(*pp->children));
+   pp->pfd = xcalloc(n, sizeof(*pp->pfd));
+   strbuf_init(&pp->buffered_output, 0);
+
+   for (i = 0; i < n; i++) {
+   strbuf_init(

[PATCHv4 09/14] submodule-config: Untangle logic in parse_config

2015-09-22 Thread Stefan Beller
CC: Eric Sunshine 
Signed-off-by: Stefan Beller 
---
 submodule-config.c | 74 +-
 1 file changed, 29 insertions(+), 45 deletions(-)

diff --git a/submodule-config.c b/submodule-config.c
index 393de53..afe0ea8 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -257,78 +257,62 @@ static int parse_config(const char *var, const char *value, void *data)
if (!name_and_item_from_var(var, &name, &item))
return 0;
 
-   submodule = lookup_or_create_by_name(me->cache, me->gitmodules_sha1,
-   name.buf);
+   submodule = lookup_or_create_by_name(me->cache,
+me->gitmodules_sha1,
+name.buf);
 
if (!strcmp(item.buf, "path")) {
-   struct strbuf path = STRBUF_INIT;
-   if (!value) {
+   if (!value)
ret = config_error_nonbool(var);
-   goto release_return;
-   }
-   if (!me->overwrite && submodule->path != NULL) {
+   else if (!me->overwrite && submodule->path != NULL)
warn_multiple_config(me->commit_sha1, submodule->name,
"path");
-   goto release_return;
+   else {
+   if (submodule->path)
+   cache_remove_path(me->cache, submodule);
+   free((void *) submodule->path);
+   submodule->path = xstrdup(value);
+   cache_put_path(me->cache, submodule);
}
-
-   if (submodule->path)
-   cache_remove_path(me->cache, submodule);
-   free((void *) submodule->path);
-   strbuf_addstr(&path, value);
-   submodule->path = strbuf_detach(&path, NULL);
-   cache_put_path(me->cache, submodule);
} else if (!strcmp(item.buf, "fetchrecursesubmodules")) {
/* when parsing worktree configurations we can die early */
int die_on_error = is_null_sha1(me->gitmodules_sha1);
if (!me->overwrite &&
-   submodule->fetch_recurse != RECURSE_SUBMODULES_NONE) {
+   submodule->fetch_recurse != RECURSE_SUBMODULES_NONE)
warn_multiple_config(me->commit_sha1, submodule->name,
"fetchrecursesubmodules");
-   goto release_return;
-   }
-
-   submodule->fetch_recurse = parse_fetch_recurse(var, value,
+   else
+   submodule->fetch_recurse = parse_fetch_recurse(
+   var, value,
die_on_error);
} else if (!strcmp(item.buf, "ignore")) {
-   struct strbuf ignore = STRBUF_INIT;
-   if (!me->overwrite && submodule->ignore != NULL) {
+   if (!value)
+   ret = config_error_nonbool(var);
+   else if (!me->overwrite && submodule->ignore != NULL)
warn_multiple_config(me->commit_sha1, submodule->name,
"ignore");
-   goto release_return;
-   }
-   if (!value) {
-   ret = config_error_nonbool(var);
-   goto release_return;
-   }
-   if (strcmp(value, "untracked") && strcmp(value, "dirty") &&
-   strcmp(value, "all") && strcmp(value, "none")) {
+   else if (strcmp(value, "untracked") &&
+strcmp(value, "dirty") &&
+strcmp(value, "all") &&
+strcmp(value, "none"))
warning("Invalid parameter '%s' for config option "
"'submodule.%s.ignore'", value, var);
-   goto release_return;
+   else {
+   free((void *) submodule->ignore);
+   submodule->ignore = xstrdup(value);
}
-
-   free((void *) submodule->ignore);
-   strbuf_addstr(&ignore, value);
-   submodule->ignore = strbuf_detach(&ignore, NULL);
} else if (!strcmp(item.buf, "url")) {
-   struct strbuf url = STRBUF_INIT;
if (!value) {
ret = config_error_nonbool(var);
-   goto release_return;
-   }
-   if (!me->overwrite && submodule->url != NULL) {
+   } else if (!me->overwrite && submodule->url != NULL) {
warn_multiple_config(me->commit_sha1, submodule->name,
"url");
-

[PATCHv4 02/14] xread: poll on non blocking fds

2015-09-22 Thread Stefan Beller
From the man page:
EAGAIN The file descriptor fd refers to a file other than a socket
   and has been marked nonblocking (O_NONBLOCK), and the read
   would block.

EAGAIN or EWOULDBLOCK
   The file descriptor fd refers to a socket and has been marked
   nonblocking (O_NONBLOCK), and the read would block.  POSIX.1-2001
   allows either error to be returned for this case, and does not
   require these constants to have the same value, so a portable
   application should check for both possibilities.

If we get EAGAIN or EWOULDBLOCK, the fd must have O_NONBLOCK set.
As the intent of xread is to read as much as possible, either until
the fd hits EOF or an actual error occurs, we can ease the load on the
feeder of the fd by waiting for it politely instead of busy waiting.

We need not care whether the call to poll failed: we are in an
infinite loop and can only get out of it via a conclusive read().

Signed-off-by: Stefan Beller 
---
 wrapper.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/wrapper.c b/wrapper.c
index ff49807..5517928 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -201,8 +201,17 @@ ssize_t xread(int fd, void *buf, size_t len)
len = MAX_IO_SIZE;
while (1) {
nr = read(fd, buf, len);
-   if ((nr < 0) && (errno == EAGAIN || errno == EINTR))
-   continue;
+   if (nr < 0) {
+   if (errno == EINTR)
+   continue;
+   if (errno == EAGAIN || errno == EWOULDBLOCK) {
+   struct pollfd pfd;
+   pfd.events = POLLIN;
+   pfd.fd = fd;
+   /* We deliberately ignore the return value */
+   poll(&pfd, 1, -1);
+   }
+   }
return nr;
}
 }
-- 
2.5.0.272.ga84127c.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv4 08/14] submodules: allow parallel fetching, add tests and documentation

2015-09-22 Thread Stefan Beller
This enables the work of the previous patches.

Signed-off-by: Stefan Beller 
---
 Documentation/fetch-options.txt |  7 +++
 builtin/fetch.c |  6 +-
 builtin/pull.c  |  6 ++
 submodule.c |  3 +--
 submodule.h |  2 +-
 t/t5526-fetch-submodules.sh | 19 +++
 6 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index 45583d8..6b109f6 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -100,6 +100,13 @@ ifndef::git-pull[]
reference to a commit that isn't already in the local submodule
clone.
 
+-j::
+--jobs=<n>::
+   Number of parallel children to be used for fetching submodules.
+   Each will fetch from different submodules, such that fetching many
+   submodules will be faster. By default submodules will be fetched
+   one at a time.
+
 --no-recurse-submodules::
Disable recursive fetching of submodules (this has the same effect as
using the '--recurse-submodules=no' option).
diff --git a/builtin/fetch.c b/builtin/fetch.c
index ee1f1a9..f28eac6 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -37,6 +37,7 @@ static int prune = -1; /* unspecified */
static int all, append, dry_run, force, keep, multiple, update_head_ok, verbosity;
 static int progress = -1, recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
 static int tags = TAGS_DEFAULT, unshallow, update_shallow;
+static int max_children = 1;
 static const char *depth;
 static const char *upload_pack;
 static struct strbuf default_rla = STRBUF_INIT;
@@ -99,6 +100,8 @@ static struct option builtin_fetch_options[] = {
N_("fetch all tags and associated objects"), TAGS_SET),
OPT_SET_INT('n', NULL, &tags,
N_("do not fetch all tags (--no-tags)"), TAGS_UNSET),
+   OPT_INTEGER('j', "jobs", &max_children,
+   N_("number of submodules fetched in parallel")),
OPT_BOOL('p', "prune", &prune,
 N_("prune remote-tracking branches no longer on remote")),
{ OPTION_CALLBACK, 0, "recurse-submodules", NULL, N_("on-demand"),
@@ -1217,7 +1220,8 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
result = fetch_populated_submodules(&options,
submodule_prefix,
recurse_submodules,
-   verbosity < 0);
+   verbosity < 0,
+   max_children);
argv_array_clear(&options);
}
 
diff --git a/builtin/pull.c b/builtin/pull.c
index 722a83c..f0af196 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -94,6 +94,7 @@ static int opt_force;
 static char *opt_tags;
 static char *opt_prune;
 static char *opt_recurse_submodules;
+static char *max_children;
 static int opt_dry_run;
 static char *opt_keep;
 static char *opt_depth;
@@ -177,6 +178,9 @@ static struct option pull_options[] = {
N_("on-demand"),
N_("control recursive fetching of submodules"),
PARSE_OPT_OPTARG),
+   OPT_PASSTHRU('j', "jobs", &max_children, N_("n"),
+   N_("number of submodules pulled in parallel"),
+   PARSE_OPT_OPTARG),
OPT_BOOL(0, "dry-run", &opt_dry_run,
N_("dry run")),
OPT_PASSTHRU('k', "keep", &opt_keep, NULL,
@@ -524,6 +528,8 @@ static int run_fetch(const char *repo, const char 
**refspecs)
argv_array_push(&args, opt_prune);
if (opt_recurse_submodules)
argv_array_push(&args, opt_recurse_submodules);
+   if (max_children)
+   argv_array_push(&args, max_children);
if (opt_dry_run)
argv_array_push(&args, "--dry-run");
if (opt_keep)
diff --git a/submodule.c b/submodule.c
index d7c7a6e..fdaf3e4 100644
--- a/submodule.c
+++ b/submodule.c
@@ -646,10 +646,9 @@ void handle_submodule_fetch_finish( void *data, struct child_process *cp, int re
 
 int fetch_populated_submodules(const struct argv_array *options,
   const char *prefix, int command_line_option,
-  int quiet)
+  int quiet, int max_parallel_jobs)
 {
int i;
-   int max_parallel_jobs = 1;
struct submodule_parallel_fetch spf = SPF_INIT;
 
spf.work_tree = get_git_work_tree();
diff --git a/submodule.h b/submodule.h
index 5507c3d..cbc0003 100644
--- a/submodule.h
+++ b/submodule.h
@@ -31,7 +31,7 @@ void set_config_fetch_recurse_submodules(int value);
 void check_for_new_submodule_commits(unsigned char new_sha1[20]);
 int fetch_populated_submodules(const struct argv_array *options,
   const char *

[PATCHv4 12/14] git submodule update: cmd_update_clone

2015-09-22 Thread Stefan Beller
Split the cloning part out into its own function;
this allows us, in a later patch, to convert cmd_update to C.

Signed-off-by: Stefan Beller 
---
 git-submodule.sh | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index ea3260e..7f11158 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -607,6 +607,24 @@ cmd_update_recursive()
fi
 }
 
+cmd_update_clone()
+{
+   command="git checkout $subforce -q"
+   die_msg="$(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")"
+   say_msg="$(eval_gettext "Submodule path '\$displaypath': checked out '\$sha1'")"
+
+   git submodule--helper clone ${GIT_QUIET:+--quiet} ${prefix:+--prefix "$prefix"} --path "$sm_path" --name "$name" --url "$url" "$reference" "$depth" || exit
+
+   if (clear_local_git_env; cd "$sm_path" && $command "$sha1")
+   then
+   say "$say_msg"
+   else
+   err="${err};$die_msg"
+   return
+   fi
+   cmd_update_recursive
+}
+
 #
 # Update each submodule path to correct revision, using clone and checkout as 
needed
 #
@@ -680,7 +698,6 @@ cmd_update()
cmd_init "--" "$@" || return
fi
 
-   cloned_modules=
git submodule--helper list --prefix "$wt_prefix" "$@" | {
err=
while read mode sha1 stage sm_path
@@ -725,9 +742,8 @@ Maybe you want to use 'update --init'?")"
 
if ! test -d "$sm_path"/.git && ! test -f "$sm_path"/.git
then
-   git submodule--helper clone ${GIT_QUIET:+--quiet} --prefix "$prefix" --path "$sm_path" --name "$name" --url "$url" "$reference" "$depth" || exit
-   cloned_modules="$cloned_modules;$name"
-   subsha1=
+   cmd_update_clone
+   continue
else
subsha1=$(clear_local_git_env; cd "$sm_path" &&
git rev-parse --verify HEAD) ||
@@ -767,13 +783,6 @@ Maybe you want to use 'update --init'?")"
die "$(eval_gettext "Unable to fetch in submodule path '\$displaypath'")"
fi
 
-   # Is this something we just cloned?
-   case ";$cloned_modules;" in
-   *";$name;"*)
-   # then there is no local change to integrate
-   update_module=checkout ;;
-   esac
-
must_die_on_failure=
case "$update_module" in
checkout)
-- 
2.5.0.272.ga84127c.dirty



[PATCHv4 04/14] strbuf: add strbuf_read_once to read without blocking

2015-09-22 Thread Stefan Beller
The new call will read from a file descriptor into a strbuf once. The
underlying call xread_nonblock is meant to execute without blocking if
the file descriptor is set to O_NONBLOCK. It is a bug to call
strbuf_read_once on a file descriptor which would block.

Signed-off-by: Stefan Beller 
---
 strbuf.c | 11 +++
 strbuf.h |  9 +
 2 files changed, 20 insertions(+)

diff --git a/strbuf.c b/strbuf.c
index cce5eed..35e71b8 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -384,6 +384,17 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
return sb->len - oldlen;
 }
 
+ssize_t strbuf_read_once(struct strbuf *sb, int fd, size_t hint)
+{
+   ssize_t cnt;
+
+   strbuf_grow(sb, hint ? hint : 8192);
+   cnt = xread_nonblock(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
+   if (cnt > 0)
+   strbuf_setlen(sb, sb->len + cnt);
+   return cnt;
+}
+
 #define STRBUF_MAXLINK (2*PATH_MAX)
 
 int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
diff --git a/strbuf.h b/strbuf.h
index aef2794..ea69665 100644
--- a/strbuf.h
+++ b/strbuf.h
extern size_t strbuf_fread(struct strbuf *, size_t, FILE *);
 extern ssize_t strbuf_read(struct strbuf *, int fd, size_t hint);
 
 /**
+ * Read from a file descriptor that is marked as O_NONBLOCK without
+ * blocking.  Returns the number of new bytes appended to the sb.
+ * Negative return value signals there was an error returned from
+ * underlying read(2), in which case the caller should check errno.
+ * e.g. errno == EAGAIN when the read may have blocked.
+ */
+extern ssize_t strbuf_read_once(struct strbuf *, int fd, size_t hint);
+
+/**
  * Read the contents of a file, specified by its path. The third argument
  * can be used to give a hint about the file size, to avoid reallocs.
  */
-- 
2.5.0.272.ga84127c.dirty



[PATCHv4 01/14] submodule: Send "Fetching submodule " to standard error

2015-09-22 Thread Stefan Beller
From: Jonathan Nieder 

The "Pushing submodule " progress output correctly goes to
stderr, but "Fetching submodule " is going to stdout by mistake.
Fix it to write to stderr.

Noticed while trying to implement a parallel submodule fetch.  When
this particular output line went to a different file descriptor, it
was buffered separately, resulting in wrongly interleaved output if
we copied it to the terminal naively.

Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 submodule.c |  2 +-
 t/t5526-fetch-submodules.sh | 51 +++--
 2 files changed, 27 insertions(+), 26 deletions(-)

diff --git a/submodule.c b/submodule.c
index 9fcc86f..1d64e57 100644
--- a/submodule.c
+++ b/submodule.c
@@ -694,7 +694,7 @@ int fetch_populated_submodules(const struct argv_array *options,
git_dir = submodule_git_dir.buf;
if (is_directory(git_dir)) {
if (!quiet)
-   printf("Fetching submodule %s%s\n", prefix, ce->name);
+   fprintf(stderr, "Fetching submodule %s%s\n", prefix, ce->name);
cp.dir = submodule_path.buf;
argv_array_push(&argv, default_argv);
argv_array_push(&argv, "--submodule-prefix");
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index a4532b0..17759b1 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -16,7 +16,8 @@ add_upstream_commit() {
git add subfile &&
git commit -m new subfile &&
head2=$(git rev-parse --short HEAD) &&
-   echo "From $pwd/submodule" > ../expect.err &&
+   echo "Fetching submodule submodule" > ../expect.err &&
+   echo "From $pwd/submodule" >> ../expect.err &&
echo "   $head1..$head2  master -> origin/master" >> ../expect.err
) &&
(
@@ -27,6 +28,7 @@ add_upstream_commit() {
git add deepsubfile &&
git commit -m new deepsubfile &&
head2=$(git rev-parse --short HEAD) &&
+   echo "Fetching submodule submodule/subdir/deepsubmodule" >> ../expect.err
echo "From $pwd/deepsubmodule" >> ../expect.err &&
echo "   $head1..$head2  master -> origin/master" >> ../expect.err
)
@@ -56,9 +58,7 @@ test_expect_success setup '
(
cd downstream &&
git submodule update --init --recursive
-   ) &&
-   echo "Fetching submodule submodule" > expect.out &&
-   echo "Fetching submodule submodule/subdir/deepsubmodule" >> expect.out
+   )
 '
 
 test_expect_success "fetch --recurse-submodules recurses into submodules" '
@@ -67,7 +67,7 @@ test_expect_success "fetch --recurse-submodules recurses into submodules" '
cd downstream &&
git fetch --recurse-submodules >../actual.out 2>../actual.err
) &&
-   test_i18ncmp expect.out actual.out &&
+   test_must_be_empty actual.out &&
test_i18ncmp expect.err actual.err
 '
 
@@ -96,7 +96,7 @@ test_expect_success "using fetchRecurseSubmodules=true in .gitmodules recurses i
git config -f .gitmodules submodule.submodule.fetchRecurseSubmodules true &&
git fetch >../actual.out 2>../actual.err
) &&
-   test_i18ncmp expect.out actual.out &&
+   test_must_be_empty actual.out &&
test_i18ncmp expect.err actual.err
 '
 
@@ -127,7 +127,7 @@ test_expect_success "--recurse-submodules overrides fetchRecurseSubmodules setti
git config --unset -f .gitmodules submodule.submodule.fetchRecurseSubmodules &&
git config --unset submodule.submodule.fetchRecurseSubmodules
) &&
-   test_i18ncmp expect.out actual.out &&
+   test_must_be_empty actual.out &&
test_i18ncmp expect.err actual.err
 '
 
@@ -146,7 +146,7 @@ test_expect_success "--dry-run propagates to submodules" '
cd downstream &&
git fetch --recurse-submodules --dry-run >../actual.out 2>../actual.err
) &&
-   test_i18ncmp expect.out actual.out &&
+   test_must_be_empty actual.out &&
test_i18ncmp expect.err actual.err
 '
 
@@ -155,7 +155,7 @@ test_expect_success "Without --dry-run propagates to submodules" '
cd downstream &&
git fetch --recurse-submodules >../actual.out 2>../actual.err
) &&
-   test_i18ncmp expect.out actual.out &&
+   test_must_be_empty actual.out &&
test_i18ncmp expect.err actual.err
 '
 
@@ -166,7 +166,7 @@ test_expect_success "recurseSubmodules=true propagates into submodules" '
git config fetch.recurseSubmodules true
git fetch >../actual.out 2>../actual.err
) &&
-   test_i18ncmp expect.out actual.out &&
+   test_must_be_

[PATCHv4 03/14] xread_nonblock: add functionality to read from fds without blocking

2015-09-22 Thread Stefan Beller
Provide a wrapper to read(), similar to xread(), that restarts on
EINTR but not EAGAIN (or EWOULDBLOCK). This enables the caller to
handle polling itself, possibly polling multiple sockets or performing
some other action.

Helped-by: Jacob Keller 
Helped-by: Jeff King ,
Helped-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 git-compat-util.h |  1 +
 wrapper.c | 22 ++
 2 files changed, 23 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index c6d391f..9ccea85 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -718,6 +718,7 @@ extern void *xcalloc(size_t nmemb, size_t size);
extern void *xmmap(void *start, size_t length, int prot, int flags, int fd, off_t offset);
extern void *xmmap_gently(void *start, size_t length, int prot, int flags, int fd, off_t offset);
 extern ssize_t xread(int fd, void *buf, size_t len);
+extern ssize_t xread_nonblock(int fd, void *buf, size_t len);
 extern ssize_t xwrite(int fd, const void *buf, size_t len);
 extern ssize_t xpread(int fd, void *buf, size_t len, off_t offset);
 extern int xdup(int fd);
diff --git a/wrapper.c b/wrapper.c
index 5517928..41a21e1 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -217,6 +217,28 @@ ssize_t xread(int fd, void *buf, size_t len)
 }
 
 /*
+ * xread_nonblock() is the same as read(), but it automatically restarts read()
+ * interrupted operations (EINTR). xread_nonblock() DOES NOT GUARANTEE that
+ * "len" bytes is read. EWOULDBLOCK is turned into EAGAIN.
+ */
+ssize_t xread_nonblock(int fd, void *buf, size_t len)
+{
+   ssize_t nr;
+   if (len > MAX_IO_SIZE)
+   len = MAX_IO_SIZE;
+   while (1) {
+   nr = read(fd, buf, len);
+   if (nr < 0) {
+   if (errno == EINTR)
+   continue;
+   if (errno == EWOULDBLOCK)
+   errno = EAGAIN;
+   }
+   return nr;
+   }
+}
+
+/*
  * xwrite() is the same a write(), but it automatically restarts write()
  * operations with a recoverable error (EAGAIN and EINTR). xwrite() DOES NOT
  * GUARANTEE that "len" bytes is written even if the operation is successful.
-- 
2.5.0.272.ga84127c.dirty



[PATCHv4 00/14] fetch submodules in parallel and a preview on parallel "submodule update"

2015-09-22 Thread Stefan Beller
Today there was lots of discussion on the correct way of reading the
strbufs as well as some discussion on the structure of the
asynchronous parallel process loop.

Patches 1-8 bring parallel fetching of submodules; they have had good
exposure to review, and the feedback has been incorporated.

Patches 9-14 bring parallel submodule updates.
Patch 14 is not ready yet (it still has test suite failures), but the
preceding cleanups in patches 9-13 can be reviewed without wasting time.

Any feedback welcome,
Thanks,
Stefan

Diff to v3 below. The patches can also be found at [1]
[1] https://github.com/stefanbeller/git/tree/submodulec_nonthreaded_parallel_4

Jonathan Nieder (1):
  submodule: Send "Fetching submodule " to standard error

Stefan Beller (13):
  xread: poll on non blocking fds
  xread_nonblock: add functionality to read from fds without blocking
  strbuf: add strbuf_read_once to read without blocking
  run-command: factor out return value computation
  run-command: add an asynchronous parallel child processor
  fetch_populated_submodules: use new parallel job processing
  submodules: allow parallel fetching, add tests and documentation
  submodule-config: Untangle logic in parse_config
  submodule config: keep update strategy around
  git submodule update: cmd_update_recursive
  git submodule update: cmd_update_clone
  git submodule update: cmd_update_fetch
  Rewrite submodule update in C

 Documentation/fetch-options.txt |   7 +
 builtin/fetch.c |   6 +-
 builtin/pull.c  |   6 +
 builtin/submodule--helper.c | 251 +
 git-compat-util.h   |   1 +
 git-submodule.sh| 339 ++--
 run-command.c   | 320 ++---
 run-command.h   |  36 +
 strbuf.c|  11 ++
 strbuf.h|   9 ++
 submodule-config.c  |  85 +-
 submodule-config.h  |   1 +
 submodule.c | 120 +-
 submodule.h |   2 +-
 t/t0061-run-command.sh  |  20 +++
 t/t5526-fetch-submodules.sh |  70 ++---
 test-run-command.c  |  24 +++
 wrapper.c   |  35 -
 18 files changed, 987 insertions(+), 356 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index baa7563..b79117a 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -382,10 +382,14 @@ static int update_next_task(void *data,
argv_array_pushf(&cp->env_array, "sm_path=%s", sub->path);
argv_array_pushf(&cp->env_array, "name=%s", sub->name);
argv_array_pushf(&cp->env_array, "url=%s", sub->url);
+   argv_array_pushf(&cp->env_array, "sha1=%s", sha1_to_hex(ce->sha1));
argv_array_pushf(&cp->env_array, "update_module=%s", update_module);
 
cp->git_cmd = 1;
+   cp->no_stdin = 1;
cp->stdout_to_stderr = 1;
+   cp->err = -1;
+   argv_array_init(&cp->args);
argv_array_push(&cp->args, "submodule");
if (!file_exists(sm_gitdir))
argv_array_push(&cp->args, "update_clone");
diff --git a/run-command.c b/run-command.c
index 06d5a5d..494e1f8 100644
--- a/run-command.c
+++ b/run-command.c
@@ -276,8 +276,10 @@ static int wait_or_whine(pid_t pid, const char *argv0)
failed_errno = errno;
error("waitpid for %s failed: %s", argv0, strerror(errno));
} else {
-   if (waiting != pid
-  || (determine_return_value(status, &code, &failed_errno, 
argv0) < 0))
+   if (waiting != pid || (determine_return_value(status,
+ &code,
+ &failed_errno,
+ argv0) < 0))
error("waitpid is confused (%s)", argv0);
}
 
@@ -870,7 +872,6 @@ struct parallel_processes {
 
int max_processes;
int nr_processes;
-   unsigned all_tasks_started : 1;
 
get_next_task_fn get_next_task;
start_failure_fn start_failure;
@@ -899,9 +900,9 @@ void default_start_failure(void *data,
struct strbuf sb = STRBUF_INIT;
 
for (i = 0; cp->argv[i]; i++)
-   strbuf_addf(&sb, "%s ", cp->argv[i]);
+   strbuf_addf(&sb, " %s", cp->argv[i]);
 
-   die_errno("Starting a child failed:\n%s", sb.buf);
+   die_errno("Starting a child failed:%s", sb.buf);
 }
 
 void default_return_value(void *data,
@@ -915,12 +916,12 @@ void default_return_value(void *data,
return;
 
for (i = 0; cp->argv[i]; i++)
-   strbuf_addf(&sb, "%s ", cp->argv[i]);
+   strbuf_addf(&sb, " %s", cp->argv[i]);
 

Re: [PATCHv3 02/13] xread: poll on non blocking fds

2015-09-22 Thread Junio C Hamano
Stefan Beller  writes:

> We should not care if the call to poll failed, as we're in an infinite
> loop and can only get out with the correct read(..).

I think I agree with that reasoning.  The only thing we want out of
this call to poll(2) is to delay calling read(2) again when we know
we would get EAGAIN again; whether the reason why poll(2) returned
is because of some error or because some data has become ready on
the file descriptor, we no longer are in that "we know we would get
EAGAIN" situation and do want to call read(2), which would return
either the right error or perform a successful read to make some
progress.

Sounds sensible.


Re: [PATCH] git-svn: make batch mode optional for git-cat-file

2015-09-22 Thread Eric Wong
Eric Wong  wrote:
> Victor Leschuk  wrote:
> > The thing is that git-cat-file keeps growing during work when running
> > in "batch" mode. See the figure attached: it is for cloning a rather
> > small repo (1 hour to clone about ~14000 revisions). However the clone
> > of a large repo (~28 revisions) took about 2 weeks and
> > git-cat-file has outgrown the parent perl process several times
> > (git-cat-file - ~3-4Gb, perl - 400Mb).

How much of that is anonymous memory, though?
(pmap $PID_OF_GIT_CAT_FILE)

Running the following on the Linux kernel tree I had lying around:

(for i in $(seq 100 200); do git ls-files | sed -e "s/^/HEAD~$i:/"; done)|\
  git cat-file --batch >/dev/null

Reveals about 510M RSS in top, but pmap says less than 20M of that
is anonymous.  So the rest are mmap-ed packfiles; that RSS gets
transparently released back to the kernel under memory pressure.


Re: [PATCHv3 02/13] xread: poll on non blocking fds

2015-09-22 Thread Stefan Beller
On Tue, Sep 22, 2015 at 1:00 PM, Junio C Hamano  wrote:
> Jeff King  writes:
>
>> On Tue, Sep 22, 2015 at 12:45:51PM -0700, Junio C Hamano wrote:
>>
>>> One caveat is that the caller may not know in the first place.
>>>
>>> The last time I checked the existing callers of xread(), there were
>>> a few that read from a file descriptor they did not open themselves
>>> (e.g. unpack-objects that read from standard input).  The invoker of
>>> these processes is free to do O_NONBLOCK their input stream for
>>> whatever reason.
>>
>> Yeah. I do not think this is a bug at all; the user might have their
>> reasons for handing off an O_NONBLOCK pipe. If we take xread() to mean
>> "try to read from fd until we get a real error, some data, or an EOF",
>> then it is perfectly reasonable to replace spinning on read() (which we
>> do now) with a poll() for efficiency. The caller (and the user) does not
>> have to care, and should not notice; the outcome will be the same.
>
> I think we are in agreement, and that answers the question/guidance
> Stefan asked earlier in $gmane/278414, which was:
>
>> So rather a combination of both, with the warning only spewing every
>> 5 seconds or such?
>
> and the answer obviously is "No warning, do a poll() without timeout
> to block".

Ok. Expect that in a reroll.

To answer the first question upthread:

> I can sort of see EINTR but why is ENOMEM any special than other errors?

So we have 4 kinds of additional errors when adding the poll. Since we want
to keep the behavior as close to the original as possible, we should not
error out for conditions we previously had under control.

   EFAULT The array given as argument was not contained in the
calling program's address space.

   EINTR  A signal occurred before any requested event; see signal(7).

   EINVAL The nfds value exceeds the RLIMIT_NOFILE value.

   ENOMEM There was no space to allocate file descriptor tables.

I don't see a way EFAULT can be returned, as the array is hard-coded
on the stack (no tricks with the heap or anything).

EINTR is possible while waiting. It doesn't actually matter whether we
retry the poll or do the read first and poll again, so we can simply
continue and do the read.

EINVAL is highly unlikely (as nfds == 1 in the case here), but can happen.
ENOMEM is similar to EINVAL.

We should not care if the call to poll failed, as we're in an infinite loop and
can only get out with the correct read(..). So maybe an implementation like this
would already suffice:

ssize_t xread(int fd, void *buf, size_t len)
{
ssize_t nr;
if (len > MAX_IO_SIZE)
len = MAX_IO_SIZE;
while (1) {
nr = read(fd, buf, len);
if (nr < 0) {
if (errno == EINTR)
continue;
if (errno == EAGAIN || errno == EWOULDBLOCK) {
struct pollfd pfd;
pfd.events = POLLIN;
pfd.fd = fd;
/* We deliberately ignore the return value of poll. */
poll(&pfd, 1, -1);
continue;
}
}
return nr;
}
}

In the resend I'll only check for EINTR and in other cases of early poll
failure, we just die(..).


Re: [PATCH] git-svn: make batch mode optional for git-cat-file

2015-09-22 Thread Eric Wong
Victor Leschuk  wrote:
> The thing is that git-cat-file keeps growing during work when running
> in "batch" mode. See the figure attached: it is for cloning a rather
> small repo (1 hour to clone about ~14000 revisions). However the clone
> of a large repo (~28 revisions) took about 2 weeks and
> git-cat-file has outgrown the parent perl process several times
> (git-cat-file - ~3-4Gb, perl - 400Mb).

Ugh, that sucks.
Even the 400Mb size of Perl annoys me greatly and I'd work
on fixing it if I had more time.

But I'm completely against adding this parameter to git-svn.
git-svn is not the only "cat-file --batch" user, so this option is
only hiding problems.

The best choice is to figure out why cat-file is wasting memory.

Disclaimer: I'm no expert on parts of git written in C,
but perhaps the alloc.c interface is why memory keeps growing.

> What was done:
>  * I have run it under valgrind and mtrace and haven't found any memory leaks
>  * Found the source of most number of memory reallocations 
> (batch_object_write() function (strbuf_expand -> realloc)) - tried to make 
> the streambuf object static and avoid reallocs - didn't help
>  * Tried preloading other allocators than standard glibc - no significant 
> difference

A few more questions:

* What is the largest file that existed in that repo?

* Did you try "MALLOC_MMAP_THRESHOLD_" with glibc?

  Perhaps setting that to 131072 will help, that'll force releasing
  larger chunks than that; but it might be moot if alloc.c is
  getting in the way.

If alloc.c is the culprit, I would consider transparently restarting
"cat-file --batch" once it grows to a certain size or after a certain
number of requests have been made to it.

We can probably do this inside "git cat-file" itself without
changing any callers by calling execve.


t5561 failing after make PROFILE=GEN

2015-09-22 Thread Stephan Beyer
Hi,

I noticed that t5561 fails on my machine when compiling with
"make PROFILE=GEN". Luckily, the reason seems to be the test only,
not the tool it is testing.

I tracked it down far enough to see that log_div() (defined in
t/t5561-http-backend.sh but used in t/t556x_common) appends
the given text to the access.log *before* the last GET log entry
is written.

The test code does it right (as far as I managed to look over it),
so this is maybe some odd flushing behavior of the web server?
On the other hand, the problem only occurs with PROFILE=GEN
but the web server should be independent of the Git compile-time
configuration, right? Looks weird to me but I did not dig deep.

Replacing the log_div() implementation with "return 0" and removing
the implied output solves the problem without breaking any test
functionality. (For clarity, the log_div() calls and definitions
should then be removed entirely.) I refrained from sending this trivial
patch because I am not sure it is the right way to cope with the issue.

Best
  Stephan


Re: [PATCHv3 06/13] run-command: add an asynchronous parallel child processor

2015-09-22 Thread Junio C Hamano
Stefan Beller  writes:

> On Tue, Sep 22, 2015 at 2:41 PM, Junio C Hamano  wrote:
>> Stefan Beller  writes:
>>
>>> On Tue, Sep 22, 2015 at 12:53 PM, Junio C Hamano  wrote:
 Stefan Beller  writes:

> So how would you find out when we are done?

 while (1) {
 if (we have available slot)
 ask to start a new one;
 if (nobody is running anymore)
 break;
 collect the output from running ones;
 drain the output from the foreground one;
 cull the finished process;
 }

>>>
>>> Personally I do not like the break; in the middle of
>>> the loop, but that's personal preference, I'd guess.
>>> I'll also move the condition for (we have available slot)
>>> back inside the called function.
>>>
>>> So I am thinking about using this in the reroll instead:
>>>
>>> run_processes_parallel_start_as_needed(&pp);
>>> while (pp.nr_processes > 0) {
>>> 	run_processes_parallel_buffer_stderr(&pp);
>>> 	run_processes_parallel_output(&pp);
>>> 	run_processes_parallel_collect_finished(&pp);
>>> 	run_processes_parallel_start_as_needed(&pp);
>>> }
>>
>> If you truly think the latter is easier to follow its logic (with
>> the duplicated call to the same function only to avoid break that
>> makes it clear why we are quitting the loop,
>
> I dislike having the call twice, too.
> ...
>> and without the
>> explicit "start only if we can afford to"), then I have to say that
>> our design tastes are fundamentally incompatible.
> ...
> I'll think about that.

Don't waste too much time on it ;-)  This is largely a taste thing,
and taste is very hard to reason about, understand, teach and learn.

Having said that, if I can pick one thing that I foresee to be
problematic the most (aside from these overlong names of the
functions that are private and do not need such long names), it is
that "start as many without giving any control to the caller"
interface.  I wrote "start *a* new one" in the outline above for a
reason.

Even if you want to utilize a moderate number of processes, say 16,
working at the steady state, I suspect that we would find it
suboptimal from perceived latency point of view, if we spin up 16
processes all at once at the very beginning before we start to
collect output from the designated foreground process and relay its
first line of output.  We may want to be able to tweak the caller to
spin up just a few first and let the loop ramp up to the full blast
gradually so that we can give an early feedback to the end user.
That is not something easy to arrange with "start as many without
giving control to the caller" interface.  We probably will discover
other kinds of scheduling issues once we get familiar with this
machinery and would find need for finer grained control.

And I consider such a ramp-up logic shouldn't be hidden inside the
"start_as_needed()" helper function---it is the way how the calling
loop wants its processes started, and I'd prefer to have the logic
visible in the caller's loop.
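The ramp-up idea is easy to model outside of C. The sketch below is a toy
simulation, not git's actual run-command API (all names are made up, and
Python is used only for illustration): the caller starts with a small slot
budget and doubles it on each pass of the loop until it reaches the
configured maximum, so the first feedback can arrive before all processes
have been spawned.

```python
def schedule(total_tasks, max_processes, initial=2):
    """Simulate a caller loop that ramps up the number of parallel
    children gradually instead of spawning max_processes at once.

    Returns the number of children started on each loop iteration."""
    budget = initial          # current cap; grows toward max_processes
    running = 0
    started = 0
    spawn_waves = []          # how many children each iteration started
    while started < total_tasks or running > 0:
        wave = 0
        # "start a new one" while a slot is free under the current budget
        while running < min(budget, max_processes) and started < total_tasks:
            started += 1
            running += 1
            wave += 1
        spawn_waves.append(wave)
        # pretend one child finishes per iteration ("cull the finished")
        if running:
            running -= 1
        budget = min(budget * 2, max_processes)  # the ramp-up step
    return spawn_waves
```

With 40 tasks and a cap of 16, the first iteration starts only 2 children
rather than 16, which is the point of keeping this logic in the caller's
loop where it can be tuned.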

But that is also largely a "taste" thing.


[PATCH v2] notes: correct documentation of DWIMery for notes references

2015-09-22 Thread Jacob Keller
From: Jacob Keller 

expand_notes_ref is used by --ref from git-notes(1) and --notes from the
git log to find the full refname of a notes reference. Previously the
documentation of these options was not clear about what sorts of
expansions would be performed. Fix the documentation to clearly and
accurately describe the behavior of the expansions.

Add a test for this expansion when using git notes get-ref in order to
prevent future patches from changing this behavior.

Signed-off-by: Jacob Keller 
---
 Documentation/git-notes.txt  | 4 +++-
 Documentation/pretty-options.txt | 5 +++--
 t/t3301-notes.sh | 6 ++
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-notes.txt b/Documentation/git-notes.txt
index a9a916f360ec..8de349968a3b 100644
--- a/Documentation/git-notes.txt
+++ b/Documentation/git-notes.txt
@@ -162,7 +162,9 @@ OPTIONS
 --ref <ref>::
Manipulate the notes tree in <ref>.  This overrides
'GIT_NOTES_REF' and the "core.notesRef" configuration.  The ref
-   is taken to be in `refs/notes/` if it is not qualified.
+   specifies the full refname when it begins with `refs/notes/`; when it
+   begins with `notes/`, `refs/` and otherwise `refs/notes/` is prefixed
+   to form a full name of the ref.
 
 --ignore-missing::
Do not consider it an error to request removing notes from an
diff --git a/Documentation/pretty-options.txt b/Documentation/pretty-options.txt
index 8d6c5cec4c5e..4b659ac1a6a6 100644
--- a/Documentation/pretty-options.txt
+++ b/Documentation/pretty-options.txt
@@ -55,8 +55,9 @@ By default, the notes shown are from the notes refs listed in the
 environment overrides). See linkgit:git-config[1] for more details.
 +
With an optional '<ref>' argument, show this notes ref instead of the
-default notes ref(s). The ref is taken to be in `refs/notes/` if it
-is not qualified.
+default notes ref(s). The ref specifies the full refname when it begins
+with `refs/notes/`; when it begins with `notes/`, `refs/` and otherwise
+`refs/notes/` is prefixed to form a full name of the ref.
 +
 Multiple --notes options can be combined to control which notes are
 being displayed. Examples: "--notes=foo" will show only notes from
diff --git a/t/t3301-notes.sh b/t/t3301-notes.sh
index 06b45847c147..2d200fdf36c6 100755
--- a/t/t3301-notes.sh
+++ b/t/t3301-notes.sh
@@ -1132,6 +1132,12 @@ test_expect_success 'git notes copy diagnoses too many or too few parameters' '
test_must_fail git notes copy one two three
 '
 
+test_expect_success 'git notes get-ref expands refs/heads/master to refs/notes/refs/heads/master' '
+   test_unconfig core.notesRef &&
+   sane_unset GIT_NOTES_REF &&
+   test "$(git notes --ref=refs/heads/master get-ref)" = "refs/notes/refs/heads/master"
+'
+
 test_expect_success 'git notes get-ref (no overrides)' '
test_unconfig core.notesRef &&
sane_unset GIT_NOTES_REF &&
-- 
2.6.0.rc3.238.gc07a1e8



Re: [PATCH v7 0/7] git-p4: add support for large file systems

2015-09-22 Thread Junio C Hamano
larsxschnei...@gmail.com writes:

> From: Lars Schneider 
>
> diff to v6:
> * fix spaces in .gitattributes for Git-LFS files (old "[:space:]", new "[[:space:]]")
> * generated patch on top of next (95c4325) to resolve merge conflicts
>
> I am sorry about the "[:space:]" bug this late. I had the fix in my
> development branch but failed to apply it in this patch series.
>
> @Junio:
> Your conflict resolution in pu looks good. Nevertheless, I made this patch on
> top of next to ease the integration. I hope this is ok.

Please don't rebase without a good reason, especially after you
checked the conflict resolution is OK and your reroll does not
affect the way the conflicts are resolved.  If you based your
v6 patch on v2.6.0-rc0 and then your v7 patch needs to use something
that did not exist in v2.6.0-rc3, you cannot avoid rebasing on top of
that newer codebase to use that new feature, but otherwise, no.

This is because I prefer to apply the new series to the same base
version so that each step can be compared with the corresponding
step in the previous round.

I even try hard to keep the commits from the older round if the
patch text and log message are unchanged.  This time, I had to
backport [v7 6/7] to apply to the same base before noticing and
verifying that [v7 7/7] is the only thing that was changed in this
round.  All the other ones turned out to be identical.

Hence, the end result for me was

$ git checkout ls/p4-lfs
$ git reset --hard HEAD^
$ git am -s git-p4-lfs-7-of-7.mbox

but it took me a lot longer than necessary because of the rebase.

Thanks.


Re: [PATCHv3 06/13] run-command: add an asynchronous parallel child processor

2015-09-22 Thread Stefan Beller
On Tue, Sep 22, 2015 at 2:41 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> On Tue, Sep 22, 2015 at 12:53 PM, Junio C Hamano  wrote:
>>> Stefan Beller  writes:
>>>
 So how would you find out when we are done?
>>>
>>> while (1) {
>>> 	if (we have available slot)
>>> 		ask to start a new one;
>>> 	if (nobody is running anymore)
>>> 		break;
>>> 	collect the output from running ones;
>>> 	drain the output from the foreground one;
>>> 	cull the finished process;
>>> }
>>>
>>
>> Personally I do not like the break; in the middle of
>> the loop, but that's personal preference, I'd guess.
>> I'll also move the condition for (we have available slot)
>> back inside the called function.
>>
>> So I am thinking about using this in the reroll instead:
>>
>> run_processes_parallel_start_as_needed(&pp);
>> while (pp.nr_processes > 0) {
>> 	run_processes_parallel_buffer_stderr(&pp);
>> 	run_processes_parallel_output(&pp);
>> 	run_processes_parallel_collect_finished(&pp);
>> 	run_processes_parallel_start_as_needed(&pp);
>> }
>
> If you truly think the latter is easier to follow its logic (with
> the duplicated call to the same function only to avoid break that
> makes it clear why we are quitting the loop,

I dislike having the call twice, too.

> and without the
> explicit "start only if we can afford to"), then I have to say that
> our design tastes are fundamentally incompatible.

Well, the "start only if we can afford to" is not as easy as just

> if (we have available slot)
> ask to start a new one;

because that's only half the condition: we may also fail to start a new
one when the callback get_next_task returns without a new task.
So it becomes

while (pp.nr_processes > 0) {
	while (pp->nr_processes < pp->max_processes &&
	       start_one_process(&pp))
		; /* nothing */
	if (!pp.nr_processes)
		break;
	buffer(..);
	output_live_child(..);
	cull_finished(..);
}

I'll think about that.
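The "only half the condition" point is that the startup loop has two
independent exit conditions: a free slot must exist *and* get_next_task
must still yield work. A toy model of that shape (made-up names, Python
used only for illustration, not the actual run-command interface):

```python
def run_parallel(tasks, max_processes):
    """Model of the caller loop: start children while a slot is free
    AND the task source still yields work; otherwise fall through to
    culling, and stop once nothing is running anymore."""
    tasks = iter(tasks)
    running = []
    done = []

    def start_one_process():
        task = next(tasks, None)   # get_next_task may return "no task"
        if task is None:
            return False
        running.append(task)
        return True

    # prime the loop, then keep going while anything is running
    while len(running) < max_processes and start_one_process():
        pass
    while running:
        done.append(running.pop(0))   # "cull the finished process"
        while len(running) < max_processes and start_one_process():
            pass
    return done
```

Note that the inner loop exits either because the slot limit is reached or
because start_one_process() reports that the task source ran dry — the two
halves of the condition.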


Re: [PATCH v7 1/7] git-p4: add optional type specifier to gitConfig reader

2015-09-22 Thread Junio C Hamano
larsxschnei...@gmail.com writes:

> From: Lars Schneider 
>
> The functions “gitConfig” and “gitConfigBool” are almost identical. Make 
> “gitConfig” more generic by adding an optional type specifier. Use the type 
> specifier “—bool” with “gitConfig” to implement “gitConfigBool. This prepares 
> the implementation of other type specifiers such as “—int”.

What is this overlong single line paragraph?  Is this a MUA artifact
on my end?



Re: [PATCHv3 06/13] run-command: add an asynchronous parallel child processor

2015-09-22 Thread Junio C Hamano
Stefan Beller  writes:

> On Tue, Sep 22, 2015 at 12:53 PM, Junio C Hamano  wrote:
>> Stefan Beller  writes:
>>
>>> So how would you find out when we are done?
>>
>> while (1) {
>> 	if (we have available slot)
>> 		ask to start a new one;
>> 	if (nobody is running anymore)
>> 		break;
>> 	collect the output from running ones;
>> 	drain the output from the foreground one;
>> 	cull the finished process;
>> }
>>
>
> Personally I do not like the break; in the middle of
> the loop, but that's personal preference, I'd guess.
> I'll also move the condition for (we have available slot)
> back inside the called function.
>
> So I am thinking about using this in the reroll instead:
>
> run_processes_parallel_start_as_needed(&pp);
> while (pp.nr_processes > 0) {
> run_processes_parallel_buffer_stderr(&pp);
> run_processes_parallel_output(&pp);
> run_processes_parallel_collect_finished(&pp);
> run_processes_parallel_start_as_needed(&pp);
> }

If you truly think the latter is easier to follow its logic (with
the duplicated call to the same function only to avoid break that
makes it clear why we are quitting the loop, and without the
explicit "start only if we can afford to"), then I have to say that
our design tastes are fundamentally incompatible.  I have nothing
more to add.



Re: [PATCH] Makefile: Change library order for static linking

2015-09-22 Thread Junio C Hamano
Remi Pommarel  writes:

> I have one last question, though. Wouldn't it be nice if we had
> configure to autodetect the need for -lssl with libcurl?

Sure.  I do not think anybody tried, but it would be nice.

> We could make
> configure check for the Curl_ssl_init function symbol in libcurl,
> which is only present if libcurl has been compiled with openssl support,
> by adding something like that in configure.ac:
>
> AC_CHECK_LIB([curl], [Curl_ssl_init],
>   [NEEDS_SSL_WITH_CURL=YesPlease],
>   [NEEDS_SSL_WITH_CURL=])
>
> The thing that I'm afraid of is that checking a function that is not part
> of the official libcurl API might not be very reliable, don't you think?

That is true.

To be bluntly honest, use of autoconf (and configure generated by
it) in this project is optional, so I would not worry too much if
you misidentify a version of cURL that does not need -lssl as
needing one (or vice versa).  As long as other parts of the
resulting ./configure keeps working (read: emitting a syntactically
broken shell script is not an OK offence) and if there is a way to
work around the result of misidentification made by ./configure
(read: config.mak can cure all gotchas made by config.mak.autogen),
it would be fine to use something like the above snippet you gave us
as the starting point and it will help some positive number of
people.  We have to start from somewhere if we want to get there.

People who find versions of libcURL that gets misidentified will
send in fixes if it matters to them.

Thanks.


Re: [PATCHv3 06/13] run-command: add an asynchronous parallel child processor

2015-09-22 Thread Stefan Beller
On Tue, Sep 22, 2015 at 12:53 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> So how would you find out when we are done?
>
> while (1) {
> 	if (we have available slot)
> 		ask to start a new one;
> 	if (nobody is running anymore)
> 		break;
> 	collect the output from running ones;
> 	drain the output from the foreground one;
> 	cull the finished process;
> }
>

Personally I do not like the break; in the middle of
the loop, but that's personal preference, I'd guess.
I'll also move the condition for (we have available slot)
back inside the called function.

So I am thinking about using this in the reroll instead:

run_processes_parallel_start_as_needed(&pp);
while (pp.nr_processes > 0) {
	run_processes_parallel_buffer_stderr(&pp);
	run_processes_parallel_output(&pp);
	run_processes_parallel_collect_finished(&pp);
	run_processes_parallel_start_as_needed(&pp);
}


This eliminates the need for the flag and also exits the loop just
after the possible startup of new processes.


Re: [PATCH] notes: document behavior of --ref and --notes DWIMery

2015-09-22 Thread Junio C Hamano
"Keller, Jacob E"  writes:

> On Tue, 2015-09-22 at 13:40 -0700, Junio C Hamano wrote:
>> How about phrasing it totally differently?
>> 
>>  The ref specifies the full refname when it begins with
>>  `refs/notes/`; otherwise `refs/notes/` is prefixed to form a
>>  full name of the ref.
>> 
>> I think that would remove the need to illustrate with concrete
>> examples like refs/heads/blah.
>> 
>
> Wait, what about the DWIM of notes/ goes to refs/notes/..
> do we need to explain that here?

Yeah, throw that in, too.

... when it begins with `refs/notes/`; when it begins with
`notes/`, `refs/` and otherwise `refs/notes/` is prefixed to
form a full name of the ref.

We could emphasize the end result by ending the sentence like so:

... form a full name of the ref that is under `refs/notes/`.

if you wanted to.  I am neutral (I do not think it would hurt, but I
do not think it adds much clarity).
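The rule being agreed on here is mechanical enough to model directly.
Below is a sketch of the documented behavior only — not git's actual
expand_notes_ref implementation, and Python is used purely for
illustration:

```python
def expand_notes_ref(ref):
    """Expand a notes ref name the way the documentation describes:
    keep it as-is when it already begins with refs/notes/, prefix
    refs/ when it begins with notes/, and refs/notes/ otherwise."""
    if ref.startswith("refs/notes/"):
        return ref
    if ref.startswith("notes/"):
        return "refs/" + ref
    return "refs/notes/" + ref
```

In particular, "refs/heads/master" is not treated specially: it does not
begin with refs/notes/ or notes/, so it expands to
"refs/notes/refs/heads/master", which is exactly what the new test in the
patch checks.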

Thanks.





Re: [PATCH] notes: document behavior of --ref and --notes DWIMery

2015-09-22 Thread Keller, Jacob E
On Tue, 2015-09-22 at 13:40 -0700, Junio C Hamano wrote:
> How about phrasing it totally differently?
> 
>   The ref specifies the full refname when it begins with
>   `refs/notes/`; otherwise `refs/notes/` is prefixed to form a
>   full name of the ref.
> 
> I think that would remove the need to illustrate with concrete
> examples like refs/heads/blah.
> 

Wait, what about the DWIM of notes/ goes to refs/notes/..
do we need to explain that here? It might seem that "notes/foo" ends up
as "refs/notes/notes/foo", which is not really what we mean.

Regards,
Jake

Re: [PATCH] notes: document behavior of --ref and --notes DWIMery

2015-09-22 Thread Keller, Jacob E
On Tue, 2015-09-22 at 13:40 -0700, Junio C Hamano wrote:
> Jacob Keller  writes:
> 
> > From: Jacob Keller 
> > 
> > The --notes and --ref parameter for selecting which notes ref to
> > operate
> > on are based off of expand_notes_ref functionality. The
> > documentation
> > mentioned that an unqualified ref argument would be taken as under
> > `refs/notes/`. However, this does not clearly indicate that
> > `refs/heads/master` will expand to `refs/notes/refs/heads/master`,
> > so
> > document this behavior.
> > 
> > Add a further test for the expected behavior of git notes --ref
> > refs/heads/master get-ref as well, to ensure future patches do not
> > break
> > this assumption.
> > 
> > Signed-off-by: Jacob Keller 
> > ---
> 
> Looks OK to a cursory read, but I find "even if it is qualified
> under some other location" a bit tiring to read without adding much
> value.  To readers who consider "other" in that phrase to be clear
> enough (i.e. "location other than refs/notes"), it is totally
> redundant.  To other readers who feel "other" in that phrase to be
> under qualified (i.e. "location other than what???"), it is not
> informative enough.  Middle-ground readers who would not know if
> "refs/a" is inside or outside some "other" location, it is confusing.
> 
> After all, "a/b" is qualified under some location (i.e. a/) other
> than "refs/notes/", and it does mean "refs/notes/a/b".
> 
> How about phrasing it totally differently?
> 
>   The ref specifies the full refname when it begins with
>   `refs/notes/`; otherwise `ref/notes/` is prefixed to form a
>   full name of the ref.
> 
> I think that would remove the need to illustrate with concrete
> examples like refs/heads/blah.
> 

Yes, let's go with that.

Regards,
Jake

Re: Unable to create temporary file '/var/git/tmv3-target-overlay.git/shallow_Un8ZOR': Permission denied

2015-09-22 Thread Joakim Tjernlund
On Tue, 2015-09-22 at 22:00 +0200, Johannes Schindelin wrote:
> Hi Joakim,
> 
> On 2015-09-21 19:08, Joakim Tjernlund wrote:
> > On Mon, 2015-09-21 at 09:48 -0700, Junio C Hamano wrote:
> > > Duy Nguyen  writes:
> > > 
> > > > Is it really necessary to remove write access in $GIT_DIR? Do we (git
> > > > devs) have some guidelines about things in $GIT_DIR?
> > > 
> > > Those who are allowed to "git push" into it should be able to write
> > > there.  It is a different matter that "git" program itself may make
> > > a policy decision to disallow some operations that the filesystem
> > > bits alone would have allowed (e.g. you can arrange the "pusher" to
> > > only come over the wire via "receive-pack" and "receive-pack" may
> > > deliberately lack support for writing into $GIT_DIR/config).
> > > 
> > 
> > I view $GIT_DIR similar to "/" and "/tmp". Normally one does not let
> > normal users write to "/"
> > as you want to keep this level clean. It is not obvious to everybody
> > what files are important
> > under $GIT_DIR when mixed with tmp files etc.
> > $GIT_DIR/tmp would solve this nicely.
> 
> By now it is pretty clear that you won't find many people here who share
> your opinion about locking down the Git directory.

So I note.

> 
> The reason should be easy to understand: Git's concept is based on the
> idea that you have full control over your repository. To other
> repositories you might only have read access.

Yes, and to some repos I only have partial write access (config, hooks,
etc. might be read-only).

> 
> But this idea you have, to somehow introduce fine-grained levels of control, 
> this idea would imply that all
> of a sudden Git is no longer free to write to its files as it likes. And as 
> far as Git is concerned,
> everything inside .git/ *are* its files.

This does not compute for me: files inside .git are git's files. I only
think that not all users of a repo should have the same (write) access.
In this case it is mostly to protect the repo from "creative" users and
accidents.

> 
> So in essence, the core concept of Git -- you clone a repository you cannot 
> write to so that you have a
> local repository you can do *anything you like* to -- is pretty much 
> incompatible with this idea of a
> selective lock down of files in .git/ that not only would require you to know 
> very exactly what files Git
> might want to write, but also to keep yourself up-to-date with Git's 
> development as to which files it might


I don't see how I can avoid some of that if I want to protect areas of
the repo from accidents etc.

> want to write for *every* new version. Making only .git/tmp/ a writable 
> location further fails to
> acknowledge the fact that the hierarchy of to-be-written files follows the 
> function of those files, not any


Curious, how would you set up some level of protection on a repo?

A .git/tmp/ would make housekeeping easier: you would know that every
file under .git should be there, and if you found something you did not
recognize you would react.

> write permission hierarchy. Since the idea seems so alien to Git's core 
> concept, I called it "overzealous".
> If that hurt your feelings, I am sorry and would like to apologize.

No feelings hurt, I too regret my choice of words.

> Having said all that, I believe that reiterating this idea without pointing 
> to any benefit will continue to
> fail to convince people that the idea is sound and that Git's core concept 
> should change. If you need to
> exert more control in a specific repository, you simply make it accessible 
> only as a non-file-system remote
> (where only `git`, `git-receive-pack` and `git-upload-pack` are allowed to be 
> executed) and define hooks
> that can accept or deny on a *much* finer level than file system permissions 
> ever could, after all.

Even if I did go through this hassle, I would prefer temporary data to be
put somewhere other than .git/, as I think mixing config/persistent data
with temporary data in the same directory is something that should be
avoided.

Anyhow, I see that this idea is not something upstream agrees on so I will back 
off now.

  Jocke


Re: Specifying N revisions after the initial commit

2015-09-22 Thread Junio C Hamano
Theodore Ts'o  writes:

> On Tue, Sep 22, 2015 at 04:11:23PM -0400, Josh Boyer wrote:
>> Oh, context would help, yes.  In the case of the tree I'm parsing, I
>> know for a fact that the commit history is entirely linear and will
>> (should) always remain so.  E.g.
>> 
>> A - B - C - D - E - F ... {N}
>> 
>> So yes, finding e.g. the second commit after the root is complicated
>> for something resembling anything like a typical git repo, but this
>> isn't like that.  In other words, I can cheat.  Or at least I'm pretty
>> sure I can cheat :).
>
> I'd suggest making your script make sure "git rev-list --merges A..N"
> doesn't output any commits, so you know for sure that the commit
> history is linear.  That way you'll be certain that you can cheat.  :-)

There are histories with multiple roots without any merges, in which
case you cannot ;-)
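Both checks — no merge commits and exactly one root — are easy to model on
a parent map. This is a toy model of the history shape only, not something
that drives git itself (names and representation are made up, Python used
for illustration):

```python
def is_linear(parents):
    """parents maps each commit to the list of its parent commits.

    A history is linear iff no commit is a merge (more than one
    parent) and exactly one commit is a root (no parents) -- the
    second condition is what "rev-list --merges is empty" alone
    does not guarantee."""
    no_merges = all(len(p) <= 1 for p in parents.values())
    roots = [c for c, p in parents.items() if not p]
    return no_merges and len(roots) == 1
```

A two-root, merge-free history passes the --merges test but is still not a
single chain, which is the caveat above.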


Re: Specifying N revisions after the initial commit

2015-09-22 Thread Theodore Ts'o
On Tue, Sep 22, 2015 at 04:11:23PM -0400, Josh Boyer wrote:
> Oh, context would help, yes.  In the case of the tree I'm parsing, I
> know for a fact that the commit history is entirely linear and will
> (should) always remain so.  E.g.
> 
> A - B - C - D - E - F ... {N}
> 
> So yes, finding e.g. the second commit after the root is complicated
> for something resembling anything like a typical git repo, but this
> isn't like that.  In other words, I can cheat.  Or at least I'm pretty
> sure I can cheat :).

I'd suggest making your script make sure "git rev-list --merges A..N"
doesn't output any commits, so you know for sure that the commit
history is linear.  That way you'll be certain that you can cheat.  :-)

- Ted


Re: [PATCH] notes: document behavior of --ref and --notes DWIMery

2015-09-22 Thread Junio C Hamano
Jacob Keller  writes:

> From: Jacob Keller 
>
> The --notes and --ref parameter for selecting which notes ref to operate
> on are based off of expand_notes_ref functionality. The documentation
> mentioned that an unqualified ref argument would be taken as under
> `refs/notes/`. However, this does not clearly indicate that
> `refs/heads/master` will expand to `refs/notes/refs/heads/master`, so
> document this behavior.
>
> Add a further test for the expected behavior of git notes --ref
> refs/heads/master get-ref as well, to ensure future patches do not break
> this assumption.
>
> Signed-off-by: Jacob Keller 
> ---

Looks OK to a cursory read, but I find "even if it is qualified
under some other location" a bit tiring to read without adding much
value.  To readers who consider "other" in that phrase to be clear
enough (i.e. "location other than refs/notes"), it is totally
redundant.  To other readers who feel "other" in that phrase to be
under qualified (i.e. "location other than what???"), it is not
informative enough.  Middle-ground readers who would not know if
"refs/a" is inside or outside some "other" location, it is confusing.

After all, "a/b" is qualified under some location (i.e. a/) other
than "refs/notes/", and it does mean "refs/notes/a/b".

How about phrasing it totally differently?

The ref specifies the full refname when it begins with
`refs/notes/`; otherwise `refs/notes/` is prefixed to form a
full name of the ref.

I think that would remove the need to illustrate with concrete
examples like refs/heads/blah.



Re: [PATCH] Makefile: Change library order for static linking

2015-09-22 Thread Remi Pommarel
On Mon, Sep 21, 2015 at 10:09:52AM -0700, Junio C Hamano wrote:
> I think I said this already, but I found these bits from your patch
> 
> - $(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
> + $(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS)
> 
> that first uses libcurl and then libs (which has ssl bits) totally
> sensible.  Shouldn't that be sufficient?

That is not sufficient on its own because -lssl is not added to $(LIBS);
it is only added to $(CURL_LIBCURL) if NEEDS_SSL_WITH_CURL is set.

> 
> > The proper variable I should have used is
> > NEEDS_SSL_WITH_CURL. But this variable is not set on Linux and not
> > configurable,...
> 
> Really?  Anything is configurable from your config.mak, I would have
> thought.

You're right: using the parts you find sensible and adding
"NEEDS_SSL_WITH_CURL=YesPlease" to config.mak makes git compile fine. So
I will resubmit with only the sensible part of the patch.

I have one last question, though. Wouldn't it be nice if we had
configure autodetect the need for -lssl with libcurl? We could make
configure check for the Curl_ssl_init function symbol in libcurl,
which is only present if libcurl has been compiled with openssl support,
by adding something like this in configure.ac:

AC_CHECK_LIB([curl], [Curl_ssl_init],
[NEEDS_SSL_WITH_CURL=YesPlease],
[NEEDS_SSL_WITH_CURL=])

The thing that I'm afraid of is that checking a function that is not part
of the official libcurl API might not be very reliable, don't you think?



[PATCH] notes: document behavior of --ref and --notes DWIMery

2015-09-22 Thread Jacob Keller
From: Jacob Keller 

The --notes and --ref parameter for selecting which notes ref to operate
on are based off of expand_notes_ref functionality. The documentation
mentioned that an unqualified ref argument would be taken as under
`refs/notes/`. However, this does not clearly indicate that
`refs/heads/master` will expand to `refs/notes/refs/heads/master`, so
document this behavior.

Add a further test for the expected behavior of git notes --ref
refs/heads/master get-ref as well, to ensure future patches do not break
this assumption.

Signed-off-by: Jacob Keller 
---
 Documentation/git-notes.txt  | 4 +++-
 Documentation/pretty-options.txt | 5 +++--
 t/t3301-notes.sh | 6 ++
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-notes.txt b/Documentation/git-notes.txt
index a9a916f360ec..2ed9c16a1aac 100644
--- a/Documentation/git-notes.txt
+++ b/Documentation/git-notes.txt
@@ -162,7 +162,9 @@ OPTIONS
 --ref <ref>::
Manipulate the notes tree in <ref>.  This overrides
'GIT_NOTES_REF' and the "core.notesRef" configuration.  The ref
-   is taken to be in `refs/notes/` if it is not qualified.
+   is taken to be in `refs/notes/` even if it is qualified under some
+   other location; in other words, `refs/heads/master` will be expanded
+   to `refs/notes/refs/heads/master`.
 
 --ignore-missing::
Do not consider it an error to request removing notes from an
diff --git a/Documentation/pretty-options.txt b/Documentation/pretty-options.txt
index 8d6c5cec4c5e..abcb787e7149 100644
--- a/Documentation/pretty-options.txt
+++ b/Documentation/pretty-options.txt
@@ -55,8 +55,9 @@ By default, the notes shown are from the notes refs listed in the
 environment overrides). See linkgit:git-config[1] for more details.
 +
With an optional '<ref>' argument, show this notes ref instead of the
-default notes ref(s). The ref is taken to be in `refs/notes/` if it
-is not qualified.
+default notes ref(s). The ref is taken to be in `refs/notes/` even if it is
+qualified under some other location; in other words, `refs/heads/master`
+will be expanded to `refs/notes/refs/heads/master`.
 +
 Multiple --notes options can be combined to control which notes are
 being displayed. Examples: "--notes=foo" will show only notes from
diff --git a/t/t3301-notes.sh b/t/t3301-notes.sh
index 06b45847c147..2d200fdf36c6 100755
--- a/t/t3301-notes.sh
+++ b/t/t3301-notes.sh
@@ -1132,6 +1132,12 @@ test_expect_success 'git notes copy diagnoses too many or too few parameters' '
test_must_fail git notes copy one two three
 '
 
+test_expect_success 'git notes get-ref expands refs/heads/master to refs/notes/refs/heads/master' '
+   test_unconfig core.notesRef &&
+   sane_unset GIT_NOTES_REF &&
+   test "$(git notes --ref=refs/heads/master get-ref)" = "refs/notes/refs/heads/master"
+'
+
 test_expect_success 'git notes get-ref (no overrides)' '
test_unconfig core.notesRef &&
sane_unset GIT_NOTES_REF &&
-- 
2.6.0.rc3.238.gc07a1e8



Re: Specifying N revisions after the initial commit

2015-09-22 Thread Josh Boyer
On Tue, Sep 22, 2015 at 3:55 PM, Junio C Hamano  wrote:
> Josh Boyer  writes:
>
>> On Tue, Sep 22, 2015 at 2:40 PM, Konstantin Khomoutov
>> ...
>>> Hence, given any particular commit, you're able to trace all of its
>>> ancestry, but the reverse is not possible.
>>
>> That makes sense.  I suppose I will have to resort to parsing output
>> of git-rev-list or something.  Thanks for the reminder.
>
> I think Konstantin explained why it fundamentally does not make
> sense to ask "which one is the Nth one after the root".  I am not
> sure how running rev-list and count its output would help, unless
> you are now solving a different problem (perhaps "find all the ones
> that are Nth after some root", which does have an answer).

Oh, context would help, yes.  In the case of the tree I'm parsing, I
know for a fact that the commit history is entirely linear and will
(should) always remain so.  E.g.

A - B - C - D - E - F ... {N}

So yes, finding e.g. the second commit after the root is complicated
for something resembling anything like a typical git repo, but this
isn't like that.  In other words, I can cheat.  Or at least I'm pretty
sure I can cheat :).

josh
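For a strictly linear history, the "cheat" amounts to materializing the ancestry in root-first order and indexing into it, which is what `git rev-list --reverse HEAD` emits. A rough Python model of that idea (the `parent_of` mapping and function names are illustrative only, not git internals):

```python
def rev_list_reverse(head, parent_of):
    """Root-first ancestry of `head`, like `git rev-list --reverse HEAD`
    on a strictly linear history. `parent_of` maps commit -> parent
    (the root has no entry)."""
    out = []
    c = head
    while c is not None:
        out.append(c)
        c = parent_of.get(c)   # follow the one backward link
    out.reverse()
    return out

def nth_after_root(head, parent_of, n):
    """The commit N steps after the root (the root itself is n == 0).

    Rough shell equivalent for a linear repo (hypothetical sketch):
        git rev-list --reverse HEAD | sed -n "$((n + 1))p"
    """
    return rev_list_reverse(head, parent_of)[n]
```

This works only because the history is linear: with branches, "Nth after the root" is ambiguous, as discussed above.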


Re: Unable to create temporary file '/var/git/tmv3-target-overlay.git/shallow_Un8ZOR': Permission denied

2015-09-22 Thread Johannes Schindelin
Hi Joakim,

On 2015-09-21 19:08, Joakim Tjernlund wrote:
> On Mon, 2015-09-21 at 09:48 -0700, Junio C Hamano wrote:
>> Duy Nguyen  writes:
>>
>> > Is it really necessary to remove write access in $GIT_DIR? Do we (git
>> > devs) have some guidelines about things in $GIT_DIR?
>>
>> Those who are allowed to "git push" into it should be able to write
>> there.  It is a different matter that "git" program itself may make
>> a policy decision to disallow some operations that the filesystem
>> bits alone would have allowed (e.g. you can arrange the "pusher" to
>> only come over the wire via "receive-pack" and "receive-pack" may
>> deliberately lack support for writing into $GIT_DIR/config).
>>
> 
> I view $GIT_DIR similar to "/" and "/tmp". Normally one does not let
> normal users write to "/"
> as you want to keep this level clean. It is not obvious to everybody
> what files are important
> under $GIT_DIR when mixed with tmp files etc.
> $GIT_DIR/tmp would solve this nicely.

By now it is pretty clear that you won't find many people here who share your 
opinion about locking down the Git directory.

The reason should be easy to understand: Git's concept is based on the idea 
that you have full control over your repository. To other repositories you 
might have only read access.

But this idea of yours, to somehow introduce fine-grained levels of control, 
would imply that all of a sudden Git is no longer free to write to its files 
as it likes. And as far as Git is concerned, everything inside .git/ *is* its 
files.

So in essence, the core concept of Git -- you clone a repository you cannot 
write to so that you have a local repository you can do *anything you like* to 
-- is pretty much incompatible with this idea of a selective lock down of files 
in .git/ that not only would require you to know exactly which files Git 
might want to write, but also to keep yourself up-to-date with Git's 
development as to which files it might want to write for *every* new version. 
Making only .git/tmp/ a writable location further fails to acknowledge the fact 
that the hierarchy of to-be-written files follows the function of those files, 
not any write permission hierarchy. Since the idea seems so alien to Git's core 
concept, I called it "overzealous". If that hurt your feelings, I am sorry and 
would like to apologize.

Having said all that, I believe that reiterating this idea without pointing to 
any benefit will continue to fail to convince people that the idea is sound and 
that Git's core concept should change. If you need to exert more control in a 
specific repository, you simply make it accessible only as a non-file-system 
remote (where only `git`, `git-receive-pack` and `git-upload-pack` are allowed 
to be executed) and define hooks that can accept or deny on a *much* finer 
level than file system permissions ever could, after all.

Ciao,
Johannes


Re: [PATCHv3 02/13] xread: poll on non blocking fds

2015-09-22 Thread Junio C Hamano
Jeff King  writes:

> On Tue, Sep 22, 2015 at 12:45:51PM -0700, Junio C Hamano wrote:
>
>> One caveat is that the caller may not know in the first place.
>> 
>> The last time I checked the existing callers of xread(), there were
>> a few that read from a file descriptor they did not open themselves
>> (e.g. unpack-objects that read from standard input).  The invoker of
>> these processes is free to do O_NONBLOCK their input stream for
>> whatever reason.
>
> Yeah. I do not think this is a bug at all; the user might have their
> reasons for handing off an O_NONBLOCK pipe. If we take xread() to mean
> "try to read from fd until we get a real error, some data, or an EOF",
> then it is perfectly reasonable to replace spinning on read() (which we
> do now) with a poll() for efficiency. The caller (and the user) does not
> have to care, and should not notice; the outcome will be the same.

I think we are in agreement, and that answers the question/guidance
Stefan asked earlier in $gmane/278414, which was:

> So rather a combination of both, with the warning only spewing every
> 5 seconds or such?

and the answer obviously is "No warning, do a poll() without timeout
to block".
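The "block in poll(2) with no timeout on EAGAIN" behaviour being agreed on here can be sketched as follows. This is a Python rendering for brevity, not the actual C change to xread(); the function name and shape mirror the discussion only:

```python
import os
import select

def xread(fd, size):
    """read() that retries on EINTR and, when the caller unknowingly
    inherited an O_NONBLOCK descriptor, sleeps in poll() with no
    timeout instead of spinning on EAGAIN."""
    while True:
        try:
            return os.read(fd, size)
        except InterruptedError:          # EINTR: just retry
            continue
        except BlockingIOError:           # EAGAIN / EWOULDBLOCK
            poller = select.poll()
            poller.register(fd, select.POLLIN)
            poller.poll()                 # no timeout: block until readable
```

The caller sees the same semantics as with a blocking descriptor: the call returns data, EOF, or a real error, and never busy-waits.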


Re: Specifying N revisions after the initial commit

2015-09-22 Thread Junio C Hamano
Josh Boyer  writes:

> On Tue, Sep 22, 2015 at 2:40 PM, Konstantin Khomoutov
> ...
>> Hence, given any particular commit, you're able to trace all of its
>> ancestry, but the reverse is not possible.
>
> That makes sense.  I suppose I will have to resort to parsing output
> of git-rev-list or something.  Thanks for the reminder.

I think Konstantin explained why it fundamentally does not make
sense to ask "which one is the Nth one after the root".  I am not
sure how running rev-list and count its output would help, unless
you are now solving a different problem (perhaps "find all the ones
that are Nth after some root", which does have an answer).



Re: [PATCHv3 06/13] run-command: add an asynchronous parallel child processor

2015-09-22 Thread Junio C Hamano
Stefan Beller  writes:

> So how would you find out when we are done?

	while (1) {
		if (we have available slot)
			ask to start a new one;
		if (nobody is running anymore)
			break;
		collect the output from running ones;
		drain the output from the foreground one;
		cull the finished process;
	}
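Fleshed out, that skeleton could look like the following Python sketch. The real series implements this in C in run-command.c; `run_parallel`, the slot bookkeeping, and the two-slot cap here are illustrative assumptions, not the patch's API:

```python
import select
import subprocess

def run_parallel(commands, max_slots=2):
    """Keep up to `max_slots` children alive, following the loop above:
    start while slots are free, break when nothing runs, collect output,
    and cull children once their pipe hits EOF."""
    pending = list(commands)
    running = {}                      # fd -> (process, collected chunks)
    results = []
    poller = select.poll()            # POSIX-only sketch
    while True:
        # if we have an available slot, ask to start a new one
        while pending and len(running) < max_slots:
            p = subprocess.Popen(pending.pop(0), stdout=subprocess.PIPE)
            fd = p.stdout.fileno()
            running[fd] = (p, [])
            poller.register(fd, select.POLLIN)
        # if nobody is running anymore, break
        if not running:
            break
        # collect the output from running ones
        for fd, _events in poller.poll():
            p, chunks = running[fd]
            data = p.stdout.read1(4096)   # at most one underlying read()
            if data:
                chunks.append(data)
            else:                         # EOF: cull the finished process
                poller.unregister(fd)
                p.wait()
                p.stdout.close()
                results.append(b"".join(chunks))
                del running[fd]
    return results
```

Blocking in poll() with no timeout is fine here because an exiting child closes its pipe, which wakes the poll and lets the loop cull it and start the next task.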



Re: [PATCHv3 02/13] xread: poll on non blocking fds

2015-09-22 Thread Jeff King
On Tue, Sep 22, 2015 at 12:45:51PM -0700, Junio C Hamano wrote:

> Jacob Keller  writes:
> 
> > I don't think this patch actually changes behavior as it stands now. I
> > think Junio's suggestion does. Personally, I'd prefer some sort of
> > warning when you use xread and get EAGAIN or EWOULDBLOCK. I'd rather
> > see it somehow warn so that we can find the bug (since we really
> > really shouldn't be calling xread with a blocking socket, especially
> > if we have xread_noblock or similar as in this series.
> 
> One caveat is that the caller may not know in the first place.
> 
> The last time I checked the existing callers of xread(), there were
> a few that read from a file descriptor they did not open themselves
> (e.g. unpack-objects that read from standard input).  The invoker of
> these processes is free to do O_NONBLOCK their input stream for
> whatever reason.

Yeah. I do not think this is a bug at all; the user might have their
reasons for handing off an O_NONBLOCK pipe. If we take xread() to mean
"try to read from fd until we get a real error, some data, or an EOF",
then it is perfectly reasonable to replace spinning on read() (which we
do now) with a poll() for efficiency. The caller (and the user) does not
have to care, and should not notice; the outcome will be the same.

-Peff


Re: [PATCHv3 02/13] xread: poll on non blocking fds

2015-09-22 Thread Junio C Hamano
Jacob Keller  writes:

> I don't think this patch actually changes behavior as it stands now. I
> think Junio's suggestion does. Personally, I'd prefer some sort of
> warning when you use xread and get EAGAIN or EWOULDBLOCK. I'd rather
> see it somehow warn so that we can find the bug (since we really
> really shouldn't be calling xread with a blocking socket, especially
> if we have xread_noblock or similar as in this series.

One caveat is that the caller may not know in the first place.

The last time I checked the existing callers of xread(), there were
a few that read from a file descriptor they did not open themselves
(e.g. unpack-objects that read from standard input).  The invoker of
these processes is free to do O_NONBLOCK their input stream for
whatever reason.


Re: [PATCH v8 4/4] worktree: add 'list' command

2015-09-22 Thread Junio C Hamano
Michael Rappazzo  writes:

>  
> +--porcelain::
> + With `list`, output in an easy-to-parse format for scripts.
> + This format will remain stable across Git versions and regardless of user
> + configuration.

... and exactly what does it output?  That would be the first
question a reader of this documentation would ask.


> @@ -93,6 +106,7 @@ OPTIONS
>  --expire ::
>   With `prune`, only expire unused working trees older than .
>  
> +

?

> diff --git a/builtin/worktree.c b/builtin/worktree.c
> index 71bb770..e6e36ac 100644
> --- a/builtin/worktree.c
> +++ b/builtin/worktree.c
> @@ -8,10 +8,13 @@
>  #include "run-command.h"
>  #include "sigchain.h"
>  #include "refs.h"
> +#include "utf8.h"
> +#include "worktree.h"
>  
>  static const char * const worktree_usage[] = {
>   N_("git worktree add []  "),
>   N_("git worktree prune []"),
> + N_("git worktree list []"),
>   NULL
>  };
>  
> @@ -359,6 +362,79 @@ static int add(int ac, const char **av, const char *prefix)
>   return add_worktree(path, branch, &opts);
>  }
>  
> +static void show_worktree_porcelain(struct worktree *worktree)
> +{
> + struct strbuf sb = STRBUF_INIT;
> +
> + strbuf_addf(&sb, "worktree %s\n", worktree->path);
> + if (worktree->is_bare)
> + strbuf_addstr(&sb, "bare");
> + else {
> + if (worktree->is_detached)
> + strbuf_addf(&sb, "detached at %s", find_unique_abbrev(worktree->head_sha1, DEFAULT_ABBREV));
> + else
> + strbuf_addf(&sb, "branch %s", shorten_unambiguous_ref(worktree->head_ref, 0));
> + }

Writing the above like this:

	if (worktree->is_bare)
		...
	else if (worktree->is_detached)
		...
	else
		...

would make it a lot more clear that there are three cases.

Also, I doubt --porcelain output wants shortened refs or abbreviated object names.

Human-readability is not a goal.  Reproducibility is.  When you run
"worktree list" today and save the output, you want the output from
"worktree list" taken tomorrow to exactly match it, even after
creating many objects and tags with conflicting names with branches,
as long as you didn't change their HEADs in the meantime.

> +
> + printf("%s\n\n", sb.buf);
> +
> + strbuf_release(&sb);

I am not sure what the point of use of a strbuf is in this function,
though.  Two printf's for each case (one for the common "worktree
%s", the other inside if/elseif/else cascade) should be sufficient.

> +static void show_worktree(struct worktree *worktree, int path_maxlen)
> +{
> + struct strbuf sb = STRBUF_INIT;
> +

Remove this blank line.  You are still declaring variables.

> + int cur_len = strlen(worktree->path);
> + int utf8_adj = cur_len - utf8_strwidth(worktree->path);

Have a blank line here, instead, as now you start your statements.

> + strbuf_addf(&sb, "%-*s ", 1 + path_maxlen + utf8_adj, worktree->path);
> + if (worktree->is_bare)
> + strbuf_addstr(&sb, "(bare)");
> + else {
> + strbuf_addf(&sb, "%s ", find_unique_abbrev(worktree->head_sha1, DEFAULT_ABBREV));

Personally I am not a big fan of the the alignment and use of
utf8_strwidth(), but by using find_unique_abbrev() here, you are
breaking the alignment, aren't you?  " [branchname]" that follows
the commit object name would not start at the same column, when
you have so many objects that the default abbrev length is not enough to
uniquely identify them.

And it can easily be fixed by computing the unique-abbrev length for
all the non-bare worktrees' HEADs in the same loop where you computed
path_maxlen in the caller, passing that to this function, and using
that as the minimum abbrev length when computing the unique-abbrev.

> + if (!worktree->is_detached)
> + strbuf_addf(&sb, "[%s]", shorten_unambiguous_ref(worktree->head_ref, 0));
> + else
> + strbuf_addstr(&sb, "(detached HEAD)");
> + }
> + printf("%s\n", sb.buf);


> diff --git a/t/t2027-worktree-list.sh b/t/t2027-worktree-list.sh
> new file mode 100755
> index 000..b68dfb4
> --- /dev/null
> +++ b/t/t2027-worktree-list.sh
> @@ -0,0 +1,86 @@
> +#!/bin/sh
> +
> +test_description='test git worktree list'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup' '
> + test_commit init
> +'
> +
> +test_expect_success '"list" all worktrees from main' '
> + echo "$(git rev-parse --show-toplevel)   $(git rev-parse --short HEAD) [$(git symbolic-ref --short HEAD)]" >expect &&

Are the number of SPs here significant and if so in what way?  Does
it depend on your environment or will there always be six of them?
Either way feels like an indication of a problem.

> + git worktree add --detach here master &&
> + test_when_finished "rm -rf here && git worktree prune" &&

Aren't these two the other way around?  When "add" fails in the
middle, you would want it to be removed to proc

[PATCH] Fixed typo in Doc/gitrepository-layout

2015-09-22 Thread Lapshin Dmitry
From: LDVSOFT 

gitrepository-layout.txt: In description of `info' directory,
note about `$GIT_COMMON_DIR' was referencing `index'
instead of `info'.

Signed-off-by: LDVSOFT 
---
 Documentation/gitrepository-layout.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/gitrepository-layout.txt b/Documentation/gitrepository-layout.txt
index 7173b38..e9c5411 100644
--- a/Documentation/gitrepository-layout.txt
+++ b/Documentation/gitrepository-layout.txt
@@ -177,7 +177,7 @@ sharedindex.::
 info::
Additional information about the repository is recorded
in this directory. This directory is ignored if $GIT_COMMON_DIR
-   is set and "$GIT_COMMON_DIR/index" will be used instead.
+   is set and "$GIT_COMMON_DIR/info" will be used instead.
 
 info/refs::
This file helps dumb transports discover what refs are

--
https://github.com/git/git/pull/177


Re: [PATCH v4 1/2] git-p4: add test case for "Translation of file content failed" error

2015-09-22 Thread Junio C Hamano
Yup, this was privately reported and I just squashed a fix in right now ;-)

Thanks. "cd t && make test-lint" would have caught it.

On Tue, Sep 22, 2015 at 12:11 PM, Michael Blume  wrote:
> I'm seeing test failures
>
> non-executable tests: t9825-git-p4-handle-utf16-without-bom.sh
>
> ls -l shows that all the other tests are executable but t9825 isn't.
>
> On Tue, Sep 22, 2015 at 9:02 AM, Junio C Hamano  wrote:
>> Lars Schneider  writes:
>>
>>> This works.
>>
>> OK, and thanks; as I don't do perforce, the squash was without any
>> testing.
>>
>>> Do we need the “-e” option?
>>
>> In syntactic sense, no, but our codebase tends to prefer to have
>> one, because it is easier to spot which ones are the instructions if
>> you consistently have "-e" even when you give only one.


Re: [PATCH v4 1/2] git-p4: add test case for "Translation of file content failed" error

2015-09-22 Thread Michael Blume
I'm seeing test failures

non-executable tests: t9825-git-p4-handle-utf16-without-bom.sh

ls -l shows that all the other tests are executable but t9825 isn't.

On Tue, Sep 22, 2015 at 9:02 AM, Junio C Hamano  wrote:
> Lars Schneider  writes:
>
>> This works.
>
> OK, and thanks; as I don't do perforce, the squash was without any
> testing.
>
>> Do we need the “-e” option?
>
> In syntactic sense, no, but our codebase tends to prefer to have
> one, because it is easier to spot which ones are the instructions if
> you consistently have "-e" even when you give only one.


Re: Specifying N revisions after the initial commit

2015-09-22 Thread Josh Boyer
On Tue, Sep 22, 2015 at 2:40 PM, Konstantin Khomoutov
 wrote:
> On Tue, 22 Sep 2015 14:32:19 -0400
> Josh Boyer  wrote:
>
>> Please CC me as I'm not subscribed.
>>
>> I was hoping someone could help me with the revision shorthand to get
>> the commit sha of a commit N commits after the initial commit.
>
> What happens if right after the initial commit, there have been five
> branches created -- with no common commits except for the initial one?
>
> That's the core limitation of the data model Git uses (and arguably any
> other DVCS system): all commits form a directed acyclic graph.
> The "directed" in that construct means that child commits contain a
> link to their parent commit (or commits) but not vice-versa.

Hm.  It has been so long since I've looked at the underlying model and
git has proven to be so flexible on such a variety of things that I
guess I forgot it was constructed through a DAG.  The --reverse
parameter to git-log and git-rev-parse had left me hopeful.

> Hence, given any particular commit, you're able to trace all of its
> ancestry, but the reverse is not possible.

That makes sense.  I suppose I will have to resort to parsing output
of git-rev-list or something.  Thanks for the reminder.

josh


Re: [PATCH 1/2] notes: don't expand qualified refs in expand_notes_ref

2015-09-22 Thread Jacob Keller
On Tue, Sep 22, 2015 at 11:37 AM, Junio C Hamano  wrote:
> Jacob Keller  writes:
>
>> The other issue here is that expand_notes_ref is called on the --ref
>> argument waaay before the current code even decides if the operation
>> is "read" or "write". Thus we'd have to break this out and handle
>> things very differently.
>
> I think you hit the right nail here.  The handling of --ref argument
> is what you need to adjust to the new world order.
>
> And "safety" is a red herring.  Those who are used to run
>
> git log --notes=refs/heads/master
>
> and relies on it to refer to refs/notes/refs/heads/master must
> continue to be able to do so.  Changing expand_notes_ref() the
> way the proposed patch does will break them, won't it?  "safety"
> is not the only (or even the primary) requirement, but the
> correctness must also be kept.

I think that's more confusing than helpful, but we really shouldn't
change behavior here. I think we need to update documentation to
clearly indicate the current behavior, as it was not obvious at least
to me. I can submit a patch for this at least.


>
>> It seems like a lot more heavy lifting to change the entire flow of
>> how we decide when to expand "--ref" for "read" operations vs "write"
>> operations.
>>
>> That is, git notes list.
>>
>> It's easy to change behavior of git notes merge as it handles its
>> argument for where to merge from separately, but it's not so easy to
>> change git notes show...
>
> Yes, exactly.
>
> I am _more than_ OK to see that read-only accesses to the notes tree
> allowed anything under refs/ (like the proposed patch did) and also
> a raw 40-hex object name in the endgame, but I really would not want
> to see "we meant well and attempted to enhance 'notes merge' but we
> ended up regressing the behaviour of unrelated operations all of a
> sudden".
>
> If you cannot do your change without breaking the existing users,
> then you at least need a sound transition plan.  But I am not sure
> yet if you have to break the world (yet).  Let me think aloud a bit
> more.
>
> There aren't all that many callers of expand_notes_ref().
>
> My preference is to leave that function as-is, especially keep the
> behaviour of the caller in revision.c that handles --show-notes=
> (and --notes= that is its synonym) the same as before.
>

Ok.

> All the other callers are only reachable from the codepath that
> originates from cmd_notes(), if I am reading the code correctly, and
> that can be enhanced without breaking the existing users.  One
> obvious way to do so would be to make "--ref" to keep the call to
> expand_notes_ref(), and add another "--notes-rawref" or whatever
> option that does not restrict it to "refs/notes", but there may be
> other ways.
>

I think the easiest way would be to have --ref check ahead of time
which commands are read or write, and perform the init_notes_check
code there instead of inside each command. I'll look at this again
when I have time in the next few days.

Regards,
Jake


git-svn aborts with "Use of uninitialized value $u" when a non-svn-backed branch is present in remote

2015-09-22 Thread Jacob Godserv
I found a specific case in which git-svn improperly aborts:

1. I created a git-svn repository, named "git-svn repo", by cloning an
svn repository via the git-svn tool.
2. I created a normal git repository, named "configuration repo". I
renamed the master branch to "configuration". The initial commit adds
a README and some utility scripts.
3. I created a bare repository, named "master repo".
4. I pushed from the git-svn repo to the master repo.
5. I pushed from the configuration repo to the master repo.

The idea is the configuration branch, which is detached from any
git-svn history, can contain some useful tools, defaults, etc., that I
can share with teammates who want to use git on this svn project. It's
an odd use of git, but it has been working well.

However, a vanilla distribution of Git for Windows 2.5.2 produces the
following error when running any git-svn command, such as "git svn
info", on the cloned master repo:

Use of uninitialized value $u in substitution (s///) at /mingw64/share/perl5/site_perl/Git/SVN.pm line 105.
Use of uninitialized value $u in concatenation (.) or string at /mingw64/share/perl5/site_perl/Git/SVN.pm line 105.
refs/remotes/origin/configuration: 'svn+ssh://10.0.1.1/repos/projectA' not found in ''

In the mentioned SVN.pm file, after the line:

my $u = (::cmt_metadata("$refname"))[0];

I added the following four lines:

if (not defined $u) {
	warn "W: $refname does not exist in SVN; skipping";
	next;
}

git-svn appears to operate correctly with this patch. This is my first
time ever editing a perl script, so I apologize if I murdered an
adorable animal just now.

I'm sending this in so more knowledgeable git-svn developers can
comment on this and fix this in the official distribution of git,
assuming there is a bug here to fix.

-- 
Jacob


Re: [PATCH] hooks/update: Add a hooks.denyunsignedtags option

2015-09-22 Thread Julian Andres Klode
On Sat, Sep 12, 2015 at 12:37:33PM +0200, Julian Andres Klode wrote:
> Introduce an option to deny unsigned tags from entering
> a repository. This is useful in teams where members forget
> to sign their release tags.
> 
> It does not actually check whether the signature is actually
> complete or valid, it just checks for the beginning of a
> signature, as further checks would be too involved.
> 
> This effectively also denies un-annotated tags, as those
> are unsigned by definition.

No comments?

-- 
Julian Andres Klode  - Debian Developer, Ubuntu Member

See http://wiki.debian.org/JulianAndresKlode and http://jak-linux.org/.

Be friendly, do not top-post, and follow RFC 1855 "Netiquette".
- If you don't I might ignore you.


Re: [PATCHv3 02/13] xread: poll on non blocking fds

2015-09-22 Thread Stefan Beller
On Tue, Sep 22, 2015 at 11:21 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> On Tue, Sep 22, 2015 at 8:58 AM, Junio C Hamano  wrote:
>>> Eric Sunshine  writes:
>>>
> while (1) {
> nr = read(fd, buf, len);
> -   if ((nr < 0) && (errno == EAGAIN || errno == EINTR))
> -   continue;
> +   if (nr < 0) {
> +   if (errno == EINTR)
> +   continue;
> +   if (errno == EAGAIN || errno == EWOULDBLOCK) {
> +   struct pollfd pfd;
> +   int i;
> +   pfd.events = POLLIN;
> +   pfd.fd = fd;
> +   i = poll(&pfd, 1, 100);

 Why is this poll() using a timeout? Isn't that still a busy wait of
 sorts (even if less aggressive)?
>>
>> True. Maybe we could have just a warning for now?
>>
>> if (errno == EAGAIN) {
>>         warning("Using xread with a non blocking fd");
>>         continue; /* preserve previous behavior */
>> }
>
> It is very likely that you will hit the same EAGAIN immediately by
> continuing and end up spewing tons of the same warning, no?

Right.

>
> We may want to slow down and think before sending a knee-jerk
> response (this applies to both of us).

Earlier this year I was doing lots of work in Gerrit, which somehow
prevented crossing emails even though I was knee-jerking all the time.
Maybe I picked up habits in Gerrit land which are only positive in Gerrit
land, while rather harming in a mailing list base workflow such as here
in Git land.

>
> I actually think a blocking poll(2) here would not be such a bad
> idea.  It would preserves the previous behaviour for callers who
> unknowingly inherited a O_NONBLOCK file descriptor from its
> environment without forcing them to waste CPU cycles.

So rather a combination of both, with the warning only spewing every
5 seconds or such?

I mean we identified this as a potential bug which we want to fix eventually
by having the caller make sure they only pass in blocking fds.

I feel like this is similar to the && call chain detection Jeff made a few
months back. At first there were bug reports coming in left and right about
broken test scripts; now it's all fixed, hopefully. The difference is that
this is not in the test suite, though, so maybe we can roll out this patch
early next cycle and get all the bugs fixed before we get serious with a
release again?

It's risky though. Maybe there is a legit use case for "I have a non blocking fd
for $REASONS and I know it, but I still want to read it now until it's done"
Actually I thought about doing that exactly myself as part of the cleanup in the
asynchronous parallel processing. We'd collect progress information using
this strbuf_read_once and polling. Once a process is done, we would need to
get all its output (i.e. up to EOF), so then we could just call strbuf_read,
as we know the child has already terminated, and all we want is to get the
last bytes out of the pipe.


Re: Specifying N revisions after the initial commit

2015-09-22 Thread Konstantin Khomoutov
On Tue, 22 Sep 2015 14:32:19 -0400
Josh Boyer  wrote:

> Please CC me as I'm not subscribed.
> 
> I was hoping someone could help me with the revision shorthand to get
> the commit sha of a commit N commits after the initial commit.

What happens if right after the initial commit, there have been five
branches created -- with no common commits except for the initial one?

That's the core limitation of the data model Git uses (and arguably any
other DVCS system): all commits form a directed acyclic graph.
The "directed" in that construct means that child commits contain a
link to their parent commit (or commits) but not vice-versa.

Hence, given any particular commit, you're able to trace all of its
ancestry, but the reverse is not possible.
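The asymmetry is easy to see in a toy model (hypothetical classes, not git internals): following parent links is a cheap local walk, while finding descendants forces a scan over every commit you know about.

```python
class Commit:
    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)   # the only stored links: child -> parent

def ancestry(commit):
    """All ancestors of `commit` (itself included): chase parent links."""
    seen = set()
    stack = [commit]
    while stack:
        c = stack.pop()
        if c.name not in seen:
            seen.add(c.name)
            stack.extend(c.parents)
    return seen

def children_of(all_commits, target):
    """Descendant lookup has no link to follow: scan everything known."""
    return sorted(c.name for c in all_commits if target in c.parents)
```

This is essentially what tools must do to answer "what comes after commit X": walk everything reachable (like rev-list does) and filter.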


Re: [PATCHv3 02/13] xread: poll on non blocking fds

2015-09-22 Thread Torsten Bögershausen
On 22.09.15 08:23, Jacob Keller wrote:
> On Mon, Sep 21, 2015 at 9:55 PM, Torsten Bögershausen  wrote:
>> But in any case I suggest to keep xread() as it is, and not to change the
>> functionality behind the back of the users.
>>
>>
> 
> I don't think this patch actually changes behavior as it stands now. I
> think Junio's suggestion does. Personally, I'd prefer some sort of
> warning when you use xread and get EAGAIN or EWOULDBLOCK. I'd rather
> see it somehow warn so that we can find the bug (since we really
> really shouldn't be calling xread with a blocking socket, especially
> if we have xread_noblock or similar as in this series.
> 
> Not sure if we really want to handle that, but I know we don't want to
> change external behavior of xread... I think that polling is better
> than the current "spinning" behavior.
> 
> Regards,
> Jake
Oh, sorry for my comment, I misread the whole thing completely.

And yes, a warning would be better than a poll()





Re: [PATCH 1/2] notes: don't expand qualified refs in expand_notes_ref

2015-09-22 Thread Junio C Hamano
Jacob Keller  writes:

> The other issue here is that expand_notes_ref is called on the --ref
> argument waaay before the current code even decides if the operation
> is "read" or "write". Thus we'd have to break this out and handle
> things very differently.

I think you hit the right nail here.  The handling of --ref argument
is what you need to adjust to the new world order.

And "safety" is a red herring.  Those who are used to run

git log --notes=refs/heads/master

and rely on it to refer to refs/notes/refs/heads/master must
continue to be able to do so.  Changing expand_notes_ref() the
way the proposed patch does will break them, won't it?  "safety"
is not the only (or even the primary) requirement, but the
correctness must also be kept.

> It seems like a lot more heavy lifting to change the entire flow of
> how we decide when to expand "--ref" for "read" operations vs "write"
> operations.
>
> That is, git notes list.
>
> It's easy to change behavior of git notes merge as it handles its
> argument for where to merge from separately, but it's not so easy to
> change git notes show...

Yes, exactly.

I am _more than_ OK to see that read-only accesses to the notes tree
allowed anything under refs/ (like the proposed patch did) and also
a raw 40-hex object name in the endgame, but I really would not want
to see "we meant well and attempted to enhance 'notes merge' but we
ended up regressing the behaviour of unrelated operations all of a
sudden".

If you cannot do your change without breaking the existing users,
then you at least need a sound transition plan.  But I am not sure
yet if you have to break the world (yet).  Let me think aloud a bit
more.

There aren't all that many callers of expand_notes_ref().

My preference is to leave that function as-is, especially keep the
behaviour of the caller in revision.c that handles --show-notes=
(and --notes= that is its synonym) the same as before.

All the other callers are only reachable from the codepath that
originates from cmd_notes(), if I am reading the code correctly, and
that can be enhanced without breaking the existing users.  One
obvious way to do so would be to make "--ref" to keep the call to
expand_notes_ref(), and add another "--notes-rawref" or whatever
option that does not restrict it to "refs/notes", but there may be
other ways.



Specifying N revisions after the initial commit

2015-09-22 Thread Josh Boyer
Hi All,

Please CC me as I'm not subscribed.

I was hoping someone could help me with the revision shorthand to get
the commit sha of a commit N commits after the initial commit.  Thus
far I've figured out that to get the initial commit in a repository,
you can use:

git rev-list --max-parents=0 HEAD

but I can't figure out how to get "give me the commit sha1 of the
commit immediately after the initial commit", or for some number N.  I
could always do something like:

git rev-list HEAD | tail -2

to get both, but I was curious if there was a refspec shorthand for
this that could be used.  It seems that git's rev parsing is all built
on going backwards in order (and probably rightfully so).

Thanks in advance.

josh


Re: [PATCHv3 06/13] run-command: add an asynchronous parallel child processor

2015-09-22 Thread Stefan Beller
On Mon, Sep 21, 2015 at 6:08 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> +void default_start_failure(void *data,
>> +struct child_process *cp,
>> +struct strbuf *err)
>> +{
>> + int i;
>> + struct strbuf sb = STRBUF_INIT;
>> +
>> + for (i = 0; cp->argv[i]; i++)
>> + strbuf_addf(&sb, "%s ", cp->argv[i]);
>> + die_errno("Starting a child failed:\n%s", sb.buf);
>
> Do we want that trailing SP after the last element of argv[]?
> Same question applies to the one in "return-value".

done
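For illustration, the trailing-SP issue can be avoided by emitting the separator before every element except the first; a minimal sketch (the helper name is hypothetical, not git's API):

```c
#include <stdio.h>

/* Hypothetical helper: join a NULL-terminated argv into buf with single
 * spaces and no trailing separator. */
static void join_argv(char *buf, size_t size, const char **argv)
{
	size_t len = 0;
	int i;

	buf[0] = '\0';
	for (i = 0; argv[i]; i++) {
		/* emit the separator before every element except the first */
		len += snprintf(buf + len, size - len, "%s%s",
				i ? " " : "", argv[i]);
		if (len >= size)
			break; /* output truncated; stop writing */
	}
}
```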

>
>> +static void run_processes_parallel_init(struct parallel_processes *pp,
>> + int n, void *data,
>> + get_next_task_fn get_next_task,
>> + start_failure_fn start_failure,
>> + return_value_fn return_value)
>> +{
>> + int i;
>> +
>> + if (n < 1)
>> + n = online_cpus();
>> +
>> + pp->max_processes = n;
>> + pp->data = data;
>> + if (!get_next_task)
>> + die("BUG: you need to specify a get_next_task function");
>> + pp->get_next_task = get_next_task;
>> +
>> + pp->start_failure = start_failure ? start_failure : 
>> default_start_failure;
>> + pp->return_value = return_value ? return_value : default_return_value;
>
> I would actually have expected that leaving these to NULL will just
> skip pp->fn calls, instead of a "default implementation", but a pair
> of very simple default implementations would not hurt.

Ok, I think the default implementation provided is a reasonable default, as
it provides enough information in case of an error.

>
>> +static void run_processes_parallel_cleanup(struct parallel_processes *pp)
>> +{
>> + int i;
>
> Have a blank between the decl block and the first stmt here (and
> elsewhere, too---which you got correct in the function above)?

done

>
>> + for (i = 0; i < pp->max_processes; i++)
>> + strbuf_release(&pp->children[i].err);
>
>> +static void run_processes_parallel_start_one(struct parallel_processes *pp)
>> +{
>> + int i;
>> +
>> + for (i = 0; i < pp->max_processes; i++)
>> + if (!pp->children[i].in_use)
>> + break;
>> + if (i == pp->max_processes)
>> + die("BUG: bookkeeping is hard");
>
> Mental note: the caller is responsible for not calling this when all
> slots are taken.
>
>> + if (!pp->get_next_task(pp->data,
>> +&pp->children[i].process,
>> +&pp->children[i].err)) {
>> + pp->all_tasks_started = 1;
>> + return;
>> + }
>
> Mental note: but it is OK to call this if get_next_task() previously
> said "no more task".
>
> The above two shows a slight discrepancy (nothing earth-breaking).

I see. Maybe this can be improved by having the
run_processes_parallel_start_as_needed call get_next_task
and pass the information into the run_processes_parallel_start_one
or as we had it before, combine these two functions again.

>
> I have this suspicion that the all-tasks-started bit may turn out to
> be a big mistake that we may later regret.  Don't we want to allow
> pp->more_task() to say "no more task to run at this moment" implying
> "but please do ask me later, because I may have found more to do by
> the time you ask me again"?

And this task would arise because the currently running children produce
more work to be done?
So you would have a more_tasks() question, and if that returns true,
get_next_task() must provide that next task?

In case we had more work to do, which is based on the outcome of the
children, we could just wait in get_next_task for a semaphore/condition
variable from the return_value. Though that would stop progress reporting
and maybe lock up the whole program due to pipe clogging.

It seems to be a better design to come back quickly to the main loop,
which does the polling, although I feel like it is over-engineered for now.

So how would you find out when we are done?
* more_tasks() could have different return values in an enum
  (YES_THERE_ARE, NO_BUT_ASK_LATER, NO_NEVER_ASK_AGAIN)
* There could be yet another callback more_tasks_available() and
   parallel_processing_should_stop()
* Hand back a callback ourselves [Call signal_parallel_processing_done(void*)
  when more_tasks will never return true again, with a void* we provide to
  more_tasks()]
* ...
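The first option, a three-state answer from more_tasks(), might look like this sketch (the enum names mirror the ones floated above; the toy task source is illustrative only):

```c
/* Hypothetical three-state answer to "are there more tasks?". */
enum task_state {
	YES_THERE_ARE,
	NO_BUT_ASK_LATER,
	NO_NEVER_ASK_AGAIN
};

/* Toy task source: tasks 0 and 1 are ready immediately; task 2 only
 * becomes ready once some "child" has finished (done_children). */
static enum task_state more_tasks(int next, int done_children)
{
	if (next < 2)
		return YES_THERE_ARE;
	if (next == 2)
		return done_children ? YES_THERE_ARE : NO_BUT_ASK_LATER;
	return NO_NEVER_ASK_AGAIN;
}

/* Main-loop sketch: NO_BUT_ASK_LATER sends us back to poll the children
 * and ask again later, instead of latching a sticky all_tasks_started
 * bit; only NO_NEVER_ASK_AGAIN ends the loop. */
static int run_all_tasks(void)
{
	int next = 0, done_children = 0, started = 0;

	for (;;) {
		switch (more_tasks(next, done_children)) {
		case YES_THERE_ARE:
			next++;            /* pretend to start the task */
			started++;
			break;
		case NO_BUT_ASK_LATER:
			done_children = 1; /* pretend a child just finished */
			break;
		case NO_NEVER_ASK_AGAIN:
			return started;
		}
	}
}
```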

>
> That is one of the reasons why I do not think the "very top level is
> a bulleted list" organization is a good idea in general.  A good
> scheduling decision can seldom be made in isolation without taking
> global picture into account.
>
>> +static void run_processes_parallel_collect_finished(struct 
>> parallel_processes *pp)
>> +{
>> + int i = 0;
>> + pid_t pid;
>> + int wait_status, code;
>> + int n = pp->max_processes;
>> +
>> + while (pp->nr_processes

Re: [PATCHv3 02/13] xread: poll on non blocking fds

2015-09-22 Thread Junio C Hamano
Stefan Beller  writes:

> On Tue, Sep 22, 2015 at 8:58 AM, Junio C Hamano  wrote:
>> Eric Sunshine  writes:
>>
>>>> while (1) {
>>>> nr = read(fd, buf, len);
>>>> -   if ((nr < 0) && (errno == EAGAIN || errno == EINTR))
>>>> -   continue;
>>>> +   if (nr < 0) {
>>>> +   if (errno == EINTR)
>>>> +   continue;
>>>> +   if (errno == EAGAIN || errno == EWOULDBLOCK) {
>>>> +   struct pollfd pfd;
>>>> +   int i;
>>>> +   pfd.events = POLLIN;
>>>> +   pfd.fd = fd;
>>>> +   i = poll(&pfd, 1, 100);
>>>
>>> Why is this poll() using a timeout? Isn't that still a busy wait of
>>> sorts (even if less aggressive)?
>
> True. Maybe we could have just a warning for now?
>
> if (errno == EAGAIN) {
> warning("Using xread with a non blocking fd");
> continue; /* preserve previous behavior */
> }

It is very likely that you will hit the same EAGAIN immediately by
continuing and end up spewing tons of the same warning, no?

We may want to slow down and think before sending a knee-jerk
response (this applies to both of us).

I actually think a blocking poll(2) here would not be such a bad
idea.  It would preserves the previous behaviour for callers who
unknowingly inherited a O_NONBLOCK file descriptor from its
environment without forcing them to waste CPU cycles.


Re: [PATCH v8 3/4] worktree: add functions to get worktree details

2015-09-22 Thread Junio C Hamano
Michael Rappazzo  writes:

> +/**
> + * get the main worktree
> + */
> +static struct worktree *get_main_worktree()

static struct worktree *get_main_worktree(void)

> +{
> + struct worktree *worktree = NULL;
>   struct strbuf path = STRBUF_INIT;
> + struct strbuf worktree_path = STRBUF_INIT;
> + struct strbuf git_dir = STRBUF_INIT;
> + struct strbuf head_ref = STRBUF_INIT;
> + int is_bare = 0;
> + int is_detached = 0;
>  
> + strbuf_addf(&git_dir, "%s", absolute_path(get_git_common_dir()));
> + strbuf_addf(&worktree_path, "%s", absolute_path(get_git_common_dir()));

Why not strbuf_add(&git_dir, absolute_path(get_git_common_dir()))
followed by strbuf_addbuf(&git_dir, &worktree_path)?

> + is_bare = !strbuf_strip_suffix(&worktree_path, "/.git");
> + if (is_bare)
> + strbuf_strip_suffix(&worktree_path, "/.");

Hmm, it is not immediately obvious which part is meant to be a
faithful translations from the old find_main_symref() that grab the
same kind of information the old code used to discover (and record
in a new mechanism that is "struct worktree"), and which part is a
new code that is needed to grab new pieces of information.  The
effort in [PATCH 2/4] to make it easier to follow sort of lost by
this rewrite.

I presume that discovery of the git_dir is from the original and
everything else including is_bare bit, head_ref, is_detached are
new?

Splitting this patch further into two may help showing the changes
better.  First move to the "grab information into an element of the
worktree array" code structure without adding new information, and
then "now make the worktree structure richer" as a follow up,
perhaps?

> +/**
> + * get the estimated worktree count for use in sizing the worktree array
> + * Note that the minimum count should be 2 (main worktree + NULL).  Using the
> + * file count in $GIT_COMMON_DIR/worktrees includes '.' and '..' so the
> + * minimum is satisfied by counting those entries.
> + */
> +static int get_estimated_worktree_count()
> +{
> +...
>  }

This is new.  My gut-feeling is that we would probably not want to
do this (see below).

> +struct worktree **get_worktrees(void)
>  {
> + struct worktree **list = NULL;
>   struct strbuf path = STRBUF_INIT;
>   DIR *dir;
>   struct dirent *d;
> - char *existing;
> + int counter = 0;
> +
> + list = xmalloc(get_estimated_worktree_count() * sizeof(struct worktree 
> *));

Here you scanned the directory to see how many possible worktrees
there are "at this moment".

Time passes.

How do you know that nobody added more worktrees in the meantime?
You don't.

>  
> - if ((existing = find_main_symref(symref, target)))
> - return existing;
> + if ((list[counter] = get_main_worktree()))
> + counter++;
>  
>   strbuf_addf(&path, "%s/worktrees", get_git_common_dir());
>   dir = opendir(path.buf);
>   strbuf_release(&path);
> + if (dir) {
> + while ((d = readdir(dir)) != NULL) {
> + struct worktree *linked = NULL;
> + if (!strcmp(d->d_name, ".") || !strcmp(d->d_name, ".."))
> + continue;
> +
> + if ((linked = get_linked_worktree(d->d_name))) {
> + list[counter++] = linked;

Which means that nothing stops you from overrunning the list[] with
this iteration.

By the way, when the body of "if" and "else" both consists of a
single statement, we tend to drop braces.

> + }

And in order to avoid overrunning, you would need to do the usual
ALLOC_GROW() dance on list[] _anyway_.  So there is not much point,
other than to optimize for a case where you have thousands of
worktrees and the usual ALLOC_GROW() approach would have to do
realloc(3) too many times, to have the initial "guestimate" scan.
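For reference, the "usual ALLOC_GROW() dance" looks roughly like this standalone rendition (the real macro lives in git's cache.h and uses xrealloc, which dies on allocation failure; error handling is omitted here):

```c
#include <stdlib.h>

/* Standalone rendition of git's ALLOC_GROW() pattern: grow the array
 * geometrically so that n appends cost O(n) reallocations amortized.
 * No up-front "estimated count" scan is needed, so an entry added
 * between the scan and the fill cannot overrun the array. */
#define alloc_nr(x) (((x) + 16) * 3 / 2)

#define ALLOC_GROW(x, nr, alloc) \
	do { \
		if ((nr) > (alloc)) { \
			(alloc) = alloc_nr(alloc) < (nr) ? \
				(nr) : alloc_nr(alloc); \
			(x) = realloc((x), (alloc) * sizeof(*(x))); \
		} \
	} while (0)
```

A caller appends with `ALLOC_GROW(list, nr + 1, alloc); list[nr++] = item;` and never has to guess the final count in advance.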




Re: [PATCH 1/2] notes: don't expand qualified refs in expand_notes_ref

2015-09-22 Thread Jacob Keller
On Tue, Sep 22, 2015 at 7:17 AM, Junio C Hamano  wrote:
> Calls expand_notes_ref() made on a command line argument that
> specifies the source (which I think is similar to what the other
> recent topic calls "read-only") should be made to calls to a more
> lenient version (and you can start with get_sha1() for that purpose
> without introducing your own DWIMs in the first step), while leaving
> calls to expand_notes_ref() for destination and the implementation
> of expand_notes_ref() itself unmolested, so that we can keep the
> safety in expands_notes_ref() that makes sure that the destination
> of a local operation is under refs/notes/*.

The other issue here is that expand_notes_ref is called on the --ref
argument waaay before the current code even decides if the operation
is "read" or "write". Thus we'd have to break this out and handle
things very differently.

I don't believe expand_notes_ref() is actually providing any safety.
That is all done by init_notes_check. Note that passing the
environment variable bypasses the expand_notes_ref entirely, though I
think we did use it as part of the refs configuration.

It seems like a lot more heavy lifting to change the entire flow of
how we decide when to expand "--ref" for "read" operations vs "write"
operations.

That is, git notes list.

It's easy to change behavior of git notes merge as it handles its
argument for where to merge from separately, but it's not so easy to
change git notes show...

Regards,
Jake


Re: [PATCH v8 2/4] worktree: refactor find_linked_symref function

2015-09-22 Thread Junio C Hamano
Michael Rappazzo  writes:

> Refactoring will help transition this code to provide additional useful
> worktree functions.
>
> Signed-off-by: Michael Rappazzo 
> ---
>  worktree.c | 86 
> ++
>  1 file changed, 64 insertions(+), 22 deletions(-)
>
> diff --git a/worktree.c b/worktree.c
> index 10e1496..5c75875 100644
> --- a/worktree.c
> +++ b/worktree.c
> @@ -3,6 +3,60 @@
>  #include "strbuf.h"
>  #include "worktree.h"
>  
> +/*
> + * read 'path_to_ref' into 'ref'.  Also if is_detached is not NULL,
> + * set is_detached to 1 if the ref is detatched.

set is_detached to 1 (0) if the ref is detached (is not detached).

> + *
> + * $GIT_COMMON_DIR/$symref (e.g. HEAD) is practically outside $GIT_DIR so
> + * for linked worktrees, `resolve_ref_unsafe()` won't work (it uses
> + * git_path). Parse the ref ourselves.
> + *
> + * return -1 if the ref is not a proper ref, 0 otherwise (success)
> + */
> +static int parse_ref(char *path_to_ref, struct strbuf *ref, int *is_detached)
> +{
> + if (is_detached)
> + *is_detached = 0;
> + if (!strbuf_readlink(ref, path_to_ref, 0)) {
> + if (!starts_with(ref->buf, "refs/")
> + || check_refname_format(ref->buf, 0))

Don't try to be creative with the format and stick to the original.

if (!starts_with(ref->buf, "refs/") ||
check_refname_format(ref->buf, 0))

> + return -1;
> +

This blank makes a strange code by making the "return -1" have no
blank above and one blank below.

> + } else if (strbuf_read_file(ref, path_to_ref, 0) >= 0) {
> + if (starts_with(ref->buf, "ref:")) {
> + strbuf_remove(ref, 0, strlen("ref:"));
> + strbuf_trim(ref);
> + if (check_refname_format(ref->buf, 0))
> + return -1;
> + } else if (is_detached)
> + *is_detached = 1;

Minor: I have a suspicion that this would be easier to follow:

if (!starts_with(...)) {
if (is_detached)
*is_detached = 1;
} else {
strbuf_remove(...);
strbuf_trim(...);
if (check_refname_format(...))
return -1;
}

> + }

What should happen when strbuf_read_file() above fails?  Would it be
a bug (i.e. the caller shouldn't have called us in the first place
with such a broken ref), would it be a repository inconsistency
(i.e. it is worth warning and stop the caller from doing further
damage), or is it OK to silently succeed?

> + return 0;
> +}
> +
> +static char *find_main_symref(const char *symref, const char *branch)
> +{
> + struct strbuf sb = STRBUF_INIT;
> + struct strbuf path = STRBUF_INIT;
> + struct strbuf gitdir = STRBUF_INIT;
> + char *existing = NULL;
> +
> + strbuf_addf(&path, "%s/%s", get_git_common_dir(), symref);
> + if (parse_ref(path.buf, &sb, NULL) == -1)
> + goto done;

I know you described it to "return -1 on an error", but as a general
style, for a function that signals a success by returning 0 and
negative on error (or on various kinds of errors), it is easier to
follow if you followed a more common pattern:

if (parse_ref(...) < 0)
goto done;

> + if (strcmp(sb.buf, branch))
> + goto done;
> + strbuf_addstr(&gitdir, get_git_common_dir());
> + strbuf_strip_suffix(&gitdir, ".git");
> + existing = strbuf_detach(&gitdir, NULL);
> +done:
> + strbuf_release(&path);
> + strbuf_release(&sb);
> + strbuf_release(&gitdir);
> +
> + return existing;
> +}
> +
>  static char *find_linked_symref(const char *symref, const char *branch,
>   const char *id)
>  {
> @@ -11,36 +65,24 @@ static char *find_linked_symref(const char *symref, const 
> char *branch,
>   struct strbuf gitdir = STRBUF_INIT;
>   char *existing = NULL;
>  
> + if (!id)
> + goto done;

A caller that calls this function with id==NULL would always get a
no-op.  Is that what the caller intended, or should it have called
the new find_main_symref() instead?  I'd imagine it is the latter,
in which case

if (!id)
die("BUG");

is more appropriate.  On the other hand, if you are trying to keep
the old interface to allow id==NULL to mean "get the information for
the primary one", I'd expect this to call find_main_symref() for the
(old-style) callers.

In either case, this "no id? no-op" looks funny.

>   /*
>* $GIT_COMMON_DIR/$symref (e.g. HEAD) is practically outside
>* $GIT_DIR so resolve_ref_unsafe() won't work (it uses
>* git_path). Parse the ref ourselves.
>*/

You moved this comment to parse_ref(), which is a more appropriate
home for it.  Perhaps you

Re: [PATCHv3 02/13] xread: poll on non blocking fds

2015-09-22 Thread Stefan Beller
On Tue, Sep 22, 2015 at 8:58 AM, Junio C Hamano  wrote:
> Eric Sunshine  writes:
>
>>> while (1) {
>>> nr = read(fd, buf, len);
>>> -   if ((nr < 0) && (errno == EAGAIN || errno == EINTR))
>>> -   continue;
>>> +   if (nr < 0) {
>>> +   if (errno == EINTR)
>>> +   continue;
>>> +   if (errno == EAGAIN || errno == EWOULDBLOCK) {
>>> +   struct pollfd pfd;
>>> +   int i;
>>> +   pfd.events = POLLIN;
>>> +   pfd.fd = fd;
>>> +   i = poll(&pfd, 1, 100);
>>
>> Why is this poll() using a timeout? Isn't that still a busy wait of
>> sorts (even if less aggressive)?
>

True. Maybe we could have just a warning for now?

if (errno == EAGAIN) {
warning("Using xread with a non blocking fd");
continue; /* preserve previous behavior */
}

I think I am going to drop this patch off the main series and spin it out
as an extra patch as the discussion is a bit unclear to me at the moment
where we're heading.

> Good point.  If we _were_ to have this kind of "hiding issues under
> the rug and continuing without issues" approach, I do not think we
> would need timeout for this poll(2).  The caller accepted that it is
> willing to wait until we read up to len (which is capped, though) by
> not calling the nonblocking variant.


Re: [PATCH v8 1/4] worktree: add top-level worktree.c

2015-09-22 Thread Junio C Hamano
Michael Rappazzo  writes:

> worktree.c contains functions to work with and get information from
> worktrees.  This introduction moves functions related to worktrees
> from branch.c into worktree.c
>
> Signed-off-by: Michael Rappazzo 
> ---
>  Makefile|  1 +
>  branch.c| 79 +-
>  branch.h|  8 --
>  builtin/notes.c |  1 +
>  worktree.c  | 82 
> +
>  worktree.h  | 12 +
>  6 files changed, 97 insertions(+), 86 deletions(-)
>  create mode 100644 worktree.c
>  create mode 100644 worktree.h
> ...
> diff --git a/branch.h b/branch.h
> index d3446ed..58aa45f 100644
> --- a/branch.h
> +++ b/branch.h
> @@ -59,12 +59,4 @@ extern int read_branch_desc(struct strbuf *, const char 
> *branch_name);
>   */
>  extern void die_if_checked_out(const char *branch);
>  
> -/*
> - * Check if a per-worktree symref points to a ref in the main worktree
> - * or any linked worktree, and return the path to the exising worktree
> - * if it is.  Returns NULL if there is no existing ref.  The caller is
> - * responsible for freeing the returned path.
> - */
> -extern char *find_shared_symref(const char *symref, const char *target);
> -
>  #endif
> diff --git a/builtin/notes.c b/builtin/notes.c
> index 3608c64..8b30334 100644
> --- a/builtin/notes.c
> +++ b/builtin/notes.c
> @@ -20,6 +20,7 @@
>  #include "notes-merge.h"
>  #include "notes-utils.h"
>  #include "branch.h"
> +#include "worktree.h"

I think you no longer need to include branch.h after this change.

Other than that, this step looks uncontroversial.

Thanks.


Re: [PATCHv3 07/13] fetch_populated_submodules: use new parallel job processing

2015-09-22 Thread Junio C Hamano
Stefan Beller  writes:

> In a later patch we enable parallel processing of submodules, this
> only adds the possibility for it. So this change should not change
> any user facing behavior.
>
> Signed-off-by: Stefan Beller 
> ---
>  builtin/fetch.c |   3 +-
>  submodule.c | 119 
> +++-
>  submodule.h |   2 +-
>  3 files changed, 87 insertions(+), 37 deletions(-)
>
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index ee1f1a9..6620ed0 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1217,7 +1217,8 @@ int cmd_fetch(int argc, const char **argv, const char 
> *prefix)
>   result = fetch_populated_submodules(&options,
>   submodule_prefix,
>   recurse_submodules,
> - verbosity < 0);
> + verbosity < 0,
> + 0);

This one...

>  int fetch_populated_submodules(const struct argv_array *options,
>  const char *prefix, int command_line_option,
> -int quiet)
> +int quiet, int max_parallel_jobs)
>  {
> - int i, result = 0;
> - struct child_process cp = CHILD_PROCESS_INIT;

... together with this one, could have been made easier to follow if
you didn't add a new parameter to the function.  Instead, you could
define a local variable max_parallel_jobs initialized to 1 in this
function without changing the function signature (and the caller) in
this step, and then did the above two changes in the same commit as
the one that actually enables the feature.

That would be more in line with the stated "only adds the possibility
for it" goal.

As posted, I suspect that by passing 0 to max_parallel_jobs, you are
telling run_processes_parallel_init() to check online_cpus() and run
that many processes in parallel, no?

> +int get_next_submodule(void *data, struct child_process *cp,
> +struct strbuf *err)
> +{
> + int ret = 0;
> + struct submodule_parallel_fetch *spf = data;
> ...
> + child_process_init(cp);
> + cp->dir = strbuf_detach(&submodule_path, NULL);
> + cp->git_cmd = 1;
> + cp->no_stdout = 1;
> + cp->no_stdin = 1;

In run-commands.c::start_command(), no_stdout triggers
dup_devnull(1) and causes dup2(2, 1) that is triggered by
stdout_to_stderr to be bypassed.  This looks wrong to me.

> + cp->stdout_to_stderr = 1;
> + cp->err = -1;

OK, the original left the standard error stream connected to the
invoker's standard error, but now we are capturing it via a pipe.


Re: [PATCH v4 1/2] git-p4: add test case for "Translation of file content failed" error

2015-09-22 Thread Junio C Hamano
Lars Schneider  writes:

> If it is no extra work for you, you can remove the quotes around
> “db”. I can also create a new patch re-roll including the sed change
> and the quote change if it is easier for you.

Now that you've tested the SQUASH??? for me, I can just squash that into
your original without resend.

Thanks.


Re: [PATCH] filter-branch: add passed/remaining seconds on progress

2015-09-22 Thread Junio C Hamano
Gabor Bernat  writes:

> On Mon, Sep 21, 2015 at 11:24 PM, Gábor Bernát
> ...
>> Agreed, :) did not abandon this, just got caught up with a lot of stuff.
>> Thanks for the help,
>
> So do I need to do anything else with this? :)

If you can fetch from me to see if the output from

git log -p origin/master..71400d97b12a

looks sensible, that would be good.  There are two commits.

Thanks.


Re: [PATCH v4 1/2] git-p4: add test case for "Translation of file content failed" error

2015-09-22 Thread Junio C Hamano
Lars Schneider  writes:

> This works.

OK, and thanks; as I don't do perforce, the squash was without any
testing.

> Do we need the “-e” option?

In syntactic sense, no, but our codebase tends to prefer to have
one, because it is easier to spot which ones are the instructions if
you consistently have "-e" even when you give only one.


Re: git-svn: Why not use git-fast-import?

2015-09-22 Thread Jeff King
On Tue, Sep 22, 2015 at 02:12:59AM -0700, Victor Leschuk wrote:

> I've been playing with git-svn for some time already and as it seems
> to me there are two most important problems which make it hard to use
> in production for large repositories. Very low speed and large memory
> footprint of synchronization with SVN repos (I am talking about clone
> and fetch operations mostly). I was wondering why the git-fast-import
> is not used for these purposes? Are there any serious limitations
> which make it impossible?

The main reason is that git-svn predates the invention of fast-import.
There was an attempt later to make a more "modern" svn interface:

  - it would provide a git-remote-helper interface (so you could use
normal clone, fetch, and push with an "svn::" URL rather than a
separate git-svn).

  - it would use fast-import for moving data into git

but it was never finished. I don't recall the specific problems offhand.
You can see the remnants in the vcs-svn directory of git, or you might
find discussions by searching the list archive.

-Peff


Re: [PATCHv3 02/13] xread: poll on non blocking fds

2015-09-22 Thread Junio C Hamano
Eric Sunshine  writes:

>> while (1) {
>> nr = read(fd, buf, len);
>> -   if ((nr < 0) && (errno == EAGAIN || errno == EINTR))
>> -   continue;
>> +   if (nr < 0) {
>> +   if (errno == EINTR)
>> +   continue;
>> +   if (errno == EAGAIN || errno == EWOULDBLOCK) {
>> +   struct pollfd pfd;
>> +   int i;
>> +   pfd.events = POLLIN;
>> +   pfd.fd = fd;
>> +   i = poll(&pfd, 1, 100);
>
> Why is this poll() using a timeout? Isn't that still a busy wait of
> sorts (even if less aggressive)?

Good point.  If we _were_ to have this kind of "hiding issues under
the rug and continuing without issues" approach, I do not think we
would need timeout for this poll(2).  The caller accepted that it is
willing to wait until we read up to len (which is capped, though) by
not calling the nonblocking variant.


Re: [PATCHv3 03/13] xread_nonblock: add functionality to read from fds nonblockingly

2015-09-22 Thread Junio C Hamano
Jacob Keller  writes:

> On Mon, Sep 21, 2015 at 3:39 PM, Stefan Beller  wrote:
>
> Maybe change the title to "without blocking" instead of "nonblockingly"?

Both suggestions make sense ;-)


Re: [PATCH] filter-branch: add passed/remaining seconds on progress

2015-09-22 Thread Gabor Bernat
On Mon, Sep 21, 2015 at 11:24 PM, Gábor Bernát
 wrote:
> On Mon, Sep 21, 2015 at 11:22 PM, Eric Sunshine 
> wrote:
>>
>> On Mon, Sep 21, 2015 at 3:52 PM, Junio C Hamano  wrote:
>> > Eric Sunshine  writes:
>> >> On Tue, Sep 8, 2015 at 1:32 PM, Junio C Hamano 
>> >> wrote:
>> >>> Eric Sunshine  writes:
>>  On Mon, Sep 7, 2015 at 9:52 AM, Gábor Bernát 
>>  wrote:
>> ...
>> >  # Rewrite the commits
>> > +report_progress ()
>> > +{
>> > +if test -n "$progress"
>> > +then
>> 
>>  Indent code within the function...
>> >>>
>> > +printf "\rRewrite $commit
>> > ($git_filter_branch__commit_count/$commits)$progress"
>> 
>>  The "\r" causes this status line to be overwritten each time through,
>>  and since the processed commit count always increases, we know that
>>  the original (without ETA) will never leave junk at the end of the
>>  line. However, with estimated seconds also being displayed, does this
>>  still hold?
>> >>>
>> >>> Good point.
>> >>> Perhaps like this squashed in?
>> >>>
>> >>> -printf "\rRewrite $commit
>> >>> ($git_filter_branch__commit_count/$commits)$progress"
>> >>> + printf "\rRewrite $commit
>> >>> ($git_filter_branch__commit_count/$commits)$progress "
>> >>
>> >> Yes, for an expedient "fix", this is what I had in mind, although I
>> >> would also have added an equal number of backspaces (\b) following the
>> >> spaces, as a minor aesthetic improvement.
>> >
>> > This topic seems to have stalled.  I do not want to discard topics
>> > because that means all the effort we spent to review and polish the
>> > patch so far gets wasted, but we cannot leave unfinished topics
>> > linger for too long.
>> >
>> > For now, I'll queue this SQUASH??? on top as a minimum fix (renaming
>> > of variables and other things noticed during the review may be worth
>> > doing, but they are not as grave as the issues this fixes, which are
>> > show stoppers).
>>
>> Looks like a reasonable squash for moving this topic forward. Thanks.
>>
>> > I do not think our in-core progress code does that (and we do not
>> > use ESC[0K either), so I'll leave it out of the minimum fix.
>> >
>> > diff --git a/git-filter-branch.sh b/git-filter-branch.sh
>> > index 565144a..71102d5 100755
>> > --- a/git-filter-branch.sh
>> > +++ b/git-filter-branch.sh
>> > @@ -277,9 +277,8 @@ test $commits -eq 0 && die "Found nothing to
>> > rewrite"
>> >  # Rewrite the commits
>> >  report_progress ()
>> >  {
>> > -if test -n "$progress"
>> > -then
>> > -   if test $git_filter_branch__commit_count -gt $next_sample_at
>> > +   if test -n "$progress" &&
>> > +   test $git_filter_branch__commit_count -gt
>> > $next_sample_at
>> > then
>> > now_timestamp=$(date +%s)
>> > elapsed_seconds=$(($now_timestamp - $start_timestamp))
>> > @@ -292,8 +291,7 @@ then
>> > fi
>> > progress=" ($elapsed_seconds seconds passed, remaining
>> > $remaining_second predicted)"
>> > fi
>> > -fi
>> > -printf "\rRewrite $commit
>> > ($git_filter_branch__commit_count/$commits)$progress"
>> > +   printf "\rRewrite $commit
>> > ($git_filter_branch__commit_count/$commits)$progress"
>> >  }
>> >
>> >  git_filter_branch__commit_count=0
>> > --
>> > 2.6.0-rc2-220-gd6fe230
>
>
> Agreed, :) did not abandon this, just got caught up with a lot of stuff.
> Thanks for the help,
>

So do I need to do anything else with this? :)
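The padding fix being squashed in above - making the shorter progress line at least as wide as the longest one already printed, so the "\r" rewrite leaves no junk on screen - can be sketched standalone (the strings below are invented, not filter-branch's real output):

```shell
#!/bin/sh
# Two consecutive progress updates; the second is shorter than the first.
# Both strings are hypothetical, not filter-branch's actual output.
long='Rewrite deadbeef (10/100) (42 seconds passed)'
short='Rewrite cafebabe (11/100)                         '

# The trailing spaces make $short at least as wide as $long, so after
# the carriage return nothing of the old line survives on screen.
[ "${#short}" -ge "${#long}" ] && echo covered

printf '%s\r%s\n' "$long" "$short"
```

On a terminal the second printf fully overwrites the first line; without the trailing spaces, the "(42 seconds passed)" tail would remain visible.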
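The elapsed/remaining computation in that squash can also be checked in isolation. The formula below is a plausible linear extrapolation with invented numbers - the patch's exact expression is elided in the quoted hunk - so treat it as a sketch:

```shell
#!/bin/sh
# Mirror of the report_progress() arithmetic with made-up inputs;
# the extrapolation formula is an assumption, not quoted from the patch.
commits=1000                        # total commits to rewrite
git_filter_branch__commit_count=250 # commits rewritten so far
start_timestamp=1000000000
now_timestamp=1000000100            # pretend 100 seconds have elapsed

elapsed_seconds=$((now_timestamp - start_timestamp))
remaining_second=$((elapsed_seconds * (commits - git_filter_branch__commit_count) / git_filter_branch__commit_count))

echo " ($elapsed_seconds seconds passed, remaining $remaining_second predicted)"
# -> " (100 seconds passed, remaining 300 predicted)"
```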


Re: [PATCHv3 09/13] submodule config: keep update strategy around

2015-09-22 Thread Stefan Beller
On Mon, Sep 21, 2015 at 5:56 PM, Eric Sunshine  wrote:
> On Mon, Sep 21, 2015 at 6:39 PM, Stefan Beller  wrote:
>> We need the submodule update strategies in a later patch.
>>
>> Signed-off-by: Stefan Beller 
>> ---
>> diff --git a/submodule-config.c b/submodule-config.c
>> @@ -326,6 +327,21 @@ static int parse_config(const char *var, const char 
>> *value, void *data)
>> free((void *) submodule->url);
>> strbuf_addstr(&url, value);
>> submodule->url = strbuf_detach(&url, NULL);
>> +   } else if (!strcmp(item.buf, "update")) {
>> +   struct strbuf update = STRBUF_INIT;
>> +   if (!value) {
>> +   ret = config_error_nonbool(var);
>> +   goto release_return;
>> +   }
>> +   if (!me->overwrite && submodule->update != NULL) {
>> +   warn_multiple_config(me->commit_sha1, 
>> submodule->name,
>> +   "update");
>> +   goto release_return;
>> +   }
>> +
>> +   free((void *) submodule->update);
>> +   strbuf_addstr(&update, value);
>> +   submodule->update = strbuf_detach(&update, NULL);
>> }
>>
>>  release_return:
>
> Why the complicated logic flow? Also, why is strbuf 'update' needed?

To be honest, I just copied it from above, where the url is read using
the exact same workflow. In the reroll I'll fix both.

>
> I'd have expected to see something simpler, such as:
>
> } else if (!strcmp(item.buf, "update")) {
> if (!value)
> ret = config_error_nonbool(var);
> else if (!me->overwrite && ...)
> warn_multiple_config(...);
> else {
> free((void *)submodule->update);
> submodule->update = xstrdup(value);
> }
> }


Re: [PATCH 1/2] notes: don't expand qualified refs in expand_notes_ref

2015-09-22 Thread Jacob Keller
On Tue, Sep 22, 2015 at 7:17 AM, Junio C Hamano  wrote:
> The current code before your patch limits the allowed pair of notes
> trees by insisting that both appear as the tips of refs somewhere in
> refs/notes/*.  For allowing to merge from outside refs/notes/, you
> need to loosen the location the latter notes tree, the one to update
> your local notes tree with, can come from.  But that does not mean
> you would want to loosen the location where the resulting notes tree
> can go.

I do not loosen where the written location can go. My patch series
sort of does the opposite of what you suggest: I change
expand_notes_ref to be more "loose" but still keep the current
checks in place, which ensure that refs outside of refs/notes are
not written to.

>
> I think the proposed patch conflates them, and that conflation does
> not help anything.  The rule of that function used to be "It must
> come from refs/notes/ and nowhere else".  That made sense in the old
> world order where both must be from refs/notes/, and the rule still
> makes sense in the new world order for the destination of the merge.
>

Yes, I don't change the destination rule, though I do change how we
can "DWIM" so that "attempting" to merge into "refs/balagrg/foo" does
not actually merge into "refs/notes/refs/balagrg/foo". I think this is
sane.

The actual "init_notes_check" code prevents merging into refs outside
of refs/notes/*

I think it's incredibly confusing that other DWIMs do not expand
"refs/*" into "refs/something/refs/*".

> The new rule of that function after the proposed patch says "It must
> come from either refs/notes or refs/ somewhere".  This does not make
> sense for the destination because it is too loose (and we didn't see
> any justification why loosening it is a good idea), and it does not
> make sense for the source because it still is too tight.  It should
> be able to take anything get_sha1() understands (including $THEIRS
> in the above example).

Agreed, we should use something more expansive for the non-write
operations. I still think the proposed change to expand_notes_ref
would be good... but that's because I find it incredibly confusing.
Apparently we disagree on that.

>
> In addition you might also want to allow additional DWIMs from X to
> refs/remote-notes/*/X as well as from X to refs/notes/X, but that is
> secondary and should be done as a follow-up "nice to have" change,
> because both "notes/remote-notes/origin/commits" and "notes/commits"
> would be understood by get_sha1() already, and it is questionable if
> it is a good idea to introduce special DWIMs that kick in only when
> the commands are talking about notes in the first place (an equally
> questionable alternative is to teach get_sha1() about refs/notes/*
> and refs/remote-notes/*/*, which will make the disambiguation noisy
> in the normal non-notes codepath---my knee-jerk reaction is to
> suggest not to go there, either).

The DWIMs I suggest are "foo" -> "refs/notes/foo" and "notes/foo" ->
"refs/notes/foo", as these already work, in addition to "<remote>/foo"
-> "refs/remote-notes/<remote>/foo" and "<remote>/notes/foo" ->
"refs/remote-notes/<remote>/foo", obviously only kicking in for notes
references.
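Those rules can be sketched as a tiny sh helper (purely illustrative - the real expand_notes_ref lives in C, and the remote-notes cases are only a proposal; the bare "remote/foo" form is omitted because telling a remote name from a notes name needs the remote configuration):

```shell
#!/bin/sh
# Hypothetical sketch of the proposed DWIM rules; not git's actual code.
expand_notes_ref() {
	case "$1" in
	refs/notes/*|refs/remote-notes/*)
		echo "$1" ;;                  # already fully qualified
	notes/*)
		echo "refs/$1" ;;             # notes/foo -> refs/notes/foo
	*/notes/*)
		# <remote>/notes/foo -> refs/remote-notes/<remote>/foo
		echo "refs/remote-notes/${1%%/notes/*}/${1##*/notes/}" ;;
	*)
		echo "refs/notes/$1" ;;       # foo -> refs/notes/foo
	esac
}

expand_notes_ref foo               # -> refs/notes/foo
expand_notes_ref notes/foo         # -> refs/notes/foo
expand_notes_ref origin/notes/foo  # -> refs/remote-notes/origin/foo
```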

>
> In any case, to get us closer to that endgame, change in the
> proposed patch does not help.  It tries to cover two different cases
> with a logic that is not a good match to either.  You need to have
> two separate helpers to interpret the source and the destination.
>

My patch does the opposite of your suggestion: it loosens
expand_notes_ref while keeping the same overall restrictions, which
achieves the same result, even if it gets there a little backward.

> Calls expand_notes_ref() made on a command line argument that
> specifies the source (which I think is similar to what the other
> recent topic calls "read-only") should be made to calls to a more
> lenient version (and you can start with get_sha1() for that purpose
> without introducing your own DWIMs in the first step), while leaving
> calls to expand_notes_ref() for destination and the implementation
> of expand_notes_ref() itself unmolested, so that we can keep the
> safety in expand_notes_ref() that makes sure that the destination
> of a local operation is under refs/notes/*.

expand_notes_ref doesn't actually provide the safety net. You can
bypass this entirely by using the command line, at least for the
non-write operations. In addition, documentation says "non qualified
refs will be expanded" which usually means "refs/*" will be left
alone, but in this case "refs/*" outside of refs/notes will be
expanded.

Even with just this patch the only difference is that an attempt to use
refs/foo/bar *won't* expand into "refs/notes/refs/foo/bar", which is an
expansion I find incredibly confusing to begin with.

Regards,
Jake


Re: [PATCH 1/2] notes: don't expand qualified refs in expand_notes_ref

2015-09-22 Thread Junio C Hamano
Jacob Keller  writes:

> I never got any better suggestion on how to allow the behavior
> desired, which is to enable merging from a non-notes location, in
> order to provide a standard location for remote notes, ie:
> refs/remote-notes/<remote>/<name>

Step back a bit and think again.  I think you are blinded by your
refs/remote-notes/*.

It is fine to wish that

$ git notes merge refs/notes/commits refs/remote-notes/origin/commits

to work, but you shouldn't force your users to use remote-tracking
refs in the first place.  Your users should be allowed to do this as
well:

$ git fetch origin refs/notes/commits
$ THEIRS=$(git rev-parse FETCH_HEAD)
$ git notes merge refs/notes/commits $THEIRS

We need to realize that "notes merge" involves two notes trees and
they are of different nature.  The notes tree you are merging into
and recording the result (the destination), which will be a local
note, and the other notes tree you obtained from elsewhere and
update that local note with (the source).

The current code before your patch limits the allowed pair of notes
trees by insisting that both appear as the tips of refs somewhere in
refs/notes/*.  For allowing to merge from outside refs/notes/, you
need to loosen the location the latter notes tree, the one to update
your local notes tree with, can come from.  But that does not mean
you would want to loosen the location where the resulting notes tree
can go.

I think the proposed patch conflates them, and that conflation does
not help anything.  The rule of that function used to be "It must
come from refs/notes/ and nowhere else".  That made sense in the old
world order where both must be from refs/notes/, and the rule still
makes sense in the new world order for the destination of the merge.

The new rule of that function after the proposed patch says "It must
come from either refs/notes or refs/ somewhere".  This does not make
sense for the destination because it is too loose (and we didn't see
any justification why loosening it is a good idea), and it does not
make sense for the source because it still is too tight.  It should
be able to take anything get_sha1() understands (including $THEIRS
in the above example).

In addition you might also want to allow additional DWIMs from X to
refs/remote-notes/*/X as well as from X to refs/notes/X, but that is
secondary and should be done as a follow-up "nice to have" change,
because both "notes/remote-notes/origin/commits" and "notes/commits"
would be understood by get_sha1() already, and it is questinable if
it is a good idea to introduce special DWIMs that kick in only when
the commands are talking about notes in the first place (an equally
questionable alternative is to teach get_sha1() about refs/notes/*
and refs/remote-notes/*/*, which will make the disambiguation noisy
in the normal non-notes codepath---my knee-jerk reaction is to
suggest not to go there, either).

In any case, to get us closer to that endgame, change in the
proposed patch does not help.  It tries to cover two different cases
with a logic that is not a good match to either.  You need to have
two separate helpers to interpret the source and the destination.

Calls expand_notes_ref() made on a command line argument that
specifies the source (which I think is similar to what the other
recent topic calls "read-only") should be made to calls to a more
lenient version (and you can start with get_sha1() for that purpose
without introducing your own DWIMs in the first step), while leaving
calls to expand_notes_ref() for destination and the implementation
of expand_notes_ref() itself unmolested, so that we can keep the
safety in expand_notes_ref() that makes sure that the destination
of a local operation is under refs/notes/*.


Re: [PATCH] git-svn: make batch mode optional for git-cat-file

2015-09-22 Thread Junio C Hamano
Victor Leschuk  writes:

> We already do have some of these: 'no-metadata', 'no-checkout',
> no-auth-cache'. So I was just following the existing convention. Do
> you think we need to change it and stick with
> --cat-file-batch=1/--cat-file-batch=0 ?

Inventing a new --cat-file-batch=[0|1] is not a good idea, and
certainly not what I would suggest at all.

My suggestion was to accept --cat-file-batch to allow the --batch
processing, and to accept --no-cat-file-batch to trigger your new
codepath (and leave --cat-file-batch the default when neither is
given).  As these option descriptions are eventually passed to
Getopt::Long, I thought it should not be too hard to arrange.

Mimicking the existing handling of no-whatever is less bad than
accepting --cat-file-batch=[0|1], if you cannot tell the code to
take --[no-]cat-file-batch for whatever reason.  In the longer term
it would need to be cleaned up together with existing ones.  Your
patch would be adding another instance that needs to be cleaned up
to that existing pile, but as long as it follows the same pattern as
existing ones, it is easier to spot what needs to be fixed later.
Compared to that, accepting --cat-file-batch=[0|1] would be far
worse, as such a future clean-up effort can miss it due to its not
following the same pattern.


RE: [PATCH] git-svn: make batch mode optional for git-cat-file

2015-09-22 Thread Victor Leschuk
As for your remark regarding the option naming: 

> An option whose name begins with no- looks somewhat strange.  You
> can even say --no-no-cat-file-batch from the command line, I
> suspect.

We already do have some of these: 'no-metadata', 'no-checkout', 
'no-auth-cache'. So I was just following the existing convention. Do you think 
we need to change it and stick with --cat-file-batch=1/--cat-file-batch=0 ?

--
Best Regards,
Victor

From: Victor Leschuk
Sent: Monday, September 21, 2015 3:03 PM
To: Junio C Hamano
Cc: git@vger.kernel.org
Subject: RE: [PATCH] git-svn: make batch mode optional for git-cat-file

Hello Junio,

thanks for your review. First of all I'd like to apologize for sending the 
patch without description. Actually I was in a hurry and sent it by accident: I 
planned to edit the mail before sending...

Here is the detailed description:

Last week we had a quick discussion in this mailing list: 
http://thread.gmane.org/gmane.comp.version-control.git/278021 .

The thing is that git-cat-file keeps growing during work when running in
"batch" mode. See the attached figure: it is for cloning a rather small repo (1
hour to clone about ~14000 revisions). However, the clone of a large repo
(~28 revisions) took about 2 weeks and git-cat-file outgrew the parent
perl process several times over (git-cat-file - ~3-4Gb, perl - ~400Mb).

What was done:
 * I ran it under valgrind and mtrace and haven't found any memory leaks
 * Found the source of the largest number of memory reallocations
(the batch_object_write() function (strbuf_expand -> realloc)) - tried to make
the strbuf object static and avoid reallocs - didn't help
 * Tried preloading allocators other than standard glibc - no significant
difference

After that I replaced the batch mode with separate cat-file calls for each blob
and it didn't have any impact on clone performance on real code repositories.
However, I created a fake test repo with a large number of small files (~10 bytes
each); here is how I created it: https://bitbucket.org/vleschuk/svngenrepo

On this artificial test repo it really slowed down the process, so I
decided to suggest making the batch mode optional to let the user "tune" the
process, and created a patch for this.

As for your code-style notes, I agree with them, will adjust the code.

--
Best Regards,
Victor

From: Junio C Hamano [jch2...@gmail.com] On Behalf Of Junio C Hamano 
[gits...@pobox.com]
Sent: Monday, September 21, 2015 11:25 AM
To: Victor Leschuk
Cc: git@vger.kernel.org; Victor Leschuk
Subject: Re: [PATCH] git-svn: make batch mode optional for git-cat-file

Victor Leschuk  writes:

> Signed-off-by: Victor Leschuk 
> ---

Before the S-o-b line is a good place to explain why this is a good
change to have.  Please use it.

>  git-svn.perl |  1 +
>  perl/Git.pm  | 41 -
>  2 files changed, 41 insertions(+), 1 deletion(-)
>
> diff --git a/git-svn.perl b/git-svn.perl
> index 36f7240..b793c26 100755
> --- a/git-svn.perl
> +++ b/git-svn.perl
> @@ -139,6 +139,7 @@ my %fc_opts = ( 'follow-parent|follow!' => 
> \$Git::SVN::_follow_parent,
>   'use-log-author' => \$Git::SVN::_use_log_author,
>   'add-author-from' => \$Git::SVN::_add_author_from,
>   'localtime' => \$Git::SVN::_localtime,
> + 'no-cat-file-batch' => sub { $Git::no_cat_file_batch = 1; },

An option whose name begins with no- looks somewhat strange.  You
can even say --no-no-cat-file-batch from the command line, I
suspect.

Why not give an option 'cat-file-batch' that sets the variable
$Git::cat_file_batch to false, and initialize the variable to true
to keep existing users who do not pass the option happy?

>   %remote_opts );
>
>  my ($_trunk, @_tags, @_branches, $_stdlayout);
> diff --git a/perl/Git.pm b/perl/Git.pm
> index 19ef081..69e5293 100644
> --- a/perl/Git.pm
> +++ b/perl/Git.pm
> @@ -107,6 +107,7 @@ use Fcntl qw(SEEK_SET SEEK_CUR);
>  use Time::Local qw(timegm);
>  }
>
> +our $no_cat_file_batch = 0;
>
>  =head1 CONSTRUCTORS
>
> @@ -1012,6 +1013,10 @@ returns the number of bytes printed.
>  =cut
>
>  sub cat_blob {
> + (1 == $no_cat_file_batch) ? _cat_blob_cmd(@_) : _cat_blob_batch(@_);

Discard "1 ==" here.  You are clearly using the variable as a
boolean, so writing this as

$cat_file_batch ? _cat_blob_batch(@_) : _cat_blob_cmd(@_);

or better yet

if ($cat_file_batch) {
_cat_blob_batch(@_);
} else {
_cat_blob_cmd(@_);
}

would be more natural.

> +}
> +
> +sub _cat_blob_batch {
>   my ($self, $sha1, $fh) = @_;
>
>   $self->_open_cat_blob_if_needed();
> @@ -1072,7 +1077,7 @@ sub cat_blob {
>  sub _open_cat_blob_if_needed {
>   my ($self) = @_;
>
> - return if defined($self->{cat_blob_pid});
> + return if ( defined($self->{cat_blob_pid}) || 1 == $no_cat_file_batch );

Likewise.

Re: Small mistakes in translation

2015-09-22 Thread Przemek Skrzyniarz
Thanks Ray,

Ray Chen  gmail.com> writes:

> 
> On Tue, 22 Sep 2015, Przemysław Skrzyniarz wrote:
> 
> The issues you found should better be reported to the progit community
> on GitHub: https://github.com/progit/progit

My next message, that is the fourth, I will send to the group correctly
(and of course I'll send my first and fourth messages again), because I
was too hasty and sent a message with similar content to this group a
moment ago.


 
> Besides, the progit book 2nd version has already been released.  Please
> check https://progit.org/translations to see how to start translation in new
> languages.

Ooo. I've not found a Polish version :-( But maybe someone will start the
translation project again. I'm not a good translator ;-)

Take care,
Przemek


Re: Small mistakes in translation

2015-09-22 Thread Przemek Skrzyniarz
I found one more mistake on page =>
https://git-scm.com/book/pl/v1/Rozproszony-Git-Rozproszone-przep%C5%82ywy-pracy
< in the third line before the next-to-last line (Is: tryby => Should be:
trybu), in the fourth part of the text entitled "Przepływ pracy z dyktatorem i
porucznikami" (place in line - cit.: "... go do używanego przez siebie tryby
pracy ...")

Regards,
Przemysław Skrzyniarz




Re: [PATCH v4 1/2] git-p4: add test case for "Translation of file content failed" error

2015-09-22 Thread Lars Schneider

On 22 Sep 2015, at 01:54, Eric Sunshine  wrote:

> On Mon, Sep 21, 2015 at 7:03 PM, Lars Schneider
>  wrote:
>> On 21 Sep 2015, at 20:09, Junio C Hamano  wrote:
>>> larsxschnei...@gmail.com writes:
>>>> +test_expect_success 'init depot with UTF-16 encoded file and artificially 
>>>> remove BOM' '
>>>> +(
>>>> +cd "db" &&
>>>> +p4d -jc &&
>>>> +# P4D automatically adds a BOM. Remove it here to make the 
>>>> file invalid.
>>>> +sed -e "$ d" depot/file1,v >depot/file1,v.new &&
>>> 
>>> Do you need the space between the address $ (i.e. the last line) and
>>> operation 'd' (i.e. delete it)?  I am asking because that looks very
>>> unusual at least in our codebase.
>> 
>> Well, I am no “sed” pro. I have to admit that I found this snippet
>> on the Internet and it just worked. If I remove the space then it
>> does not work. I was not yet able to figure out why… anyone an idea?
> 
> Yes, it's because $d is a variable reference, even within double
> quotes. Typically, one uses single quotes around the sed argument to
> suppress this sort of undesired behavior. Since the entire test body
> is already within single quotes, however, changing the sed argument to
> use single quotes, rather than double, will require escaping them:
> 
>sed -e \'$d\' depot/file...
> 
> Aside: You could also drop the unnecessary quotes from the 'cd' argument.

Thanks for the explanation. Plus you are correct with the quotes around “db”… 
just a habit.

@Junio: 
If it is no extra work for you, you can remove the quotes around “db”. I can 
also create a new patch re-roll including the sed change and the quote change if
it is easier for you.

Best,
Lars



Re: [PATCH v4 1/2] git-p4: add test case for "Translation of file content failed" error

2015-09-22 Thread Lars Schneider

On 22 Sep 2015, at 03:10, Junio C Hamano  wrote:

> Eric Sunshine  writes:
> 
>> Yes, it's because $d is a variable reference, even within double
>> quotes.
> 
> s/even/especially/ ;-)
> 
> Here is what I queued as SQUASH???
> 
> diff --git a/t/t9825-git-p4-handle-utf16-without-bom.sh 
> b/t/t9825-git-p4-handle-utf16-without-bom.sh
> index 65c3c4e..735c0bb 100644
> --- a/t/t9825-git-p4-handle-utf16-without-bom.sh
> +++ b/t/t9825-git-p4-handle-utf16-without-bom.sh
> @@ -22,8 +22,8 @@ test_expect_success 'init depot with UTF-16 encoded file 
> and artificially remove
>   cd "db" &&
>   p4d -jc &&
>   # P4D automatically adds a BOM. Remove it here to make the file 
> invalid.
> - sed -e "$ d" depot/file1,v >depot/file1,v.new &&
> - mv -- depot/file1,v.new depot/file1,v &&
> + sed -e "\$d" depot/file1,v >depot/file1,v.new &&
> + mv depot/file1,v.new depot/file1,v &&
>   printf "@$UTF16@" >>depot/file1,v &&
>   p4d -jrF checkpoint.1
>   )

This works. I even successfully tested this one:

sed \$d depot/file1,v >depot/file1,v.new &&

Do we need the “-e” option?

Thanks,
Lars
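The three spellings discussed here differ only in shell quoting; a throwaway-file check (nothing to do with the p4d depot itself):

```shell
#!/bin/sh
# d is normally just an unset shell variable; force it empty so the
# first expansion below is predictable even in unusual environments.
d=
printf 'a\nb\nc\n' >input

sed -e "$d" input    # $d expands to nothing: empty script, prints a b c
sed -e "$ d" input   # "$ " stays literal: address $, command d, last line dropped
sed -e "\$d" input   # escaped dollar: same deletion, without the odd space

rm input
```

As the follow-up notes, the -e flag is indeed optional when there is a single script argument.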




Re: Small mistakes in translation

2015-09-22 Thread Ray Chen
On Tue, 22 Sep 2015, Przemysław Skrzyniarz wrote:

> Hi,
> 
> On page =>
> https://git-scm.com/book/pl/v1/Rozproszony-Git-Rozproszone-przep%C5%82ywy-pracy
> < I found typographical error in first line (Is: powala => Should be:
> pozwala) in third part of text entitle "Przepływ pracy z osobą integrującą
> zmiany".
> And next line, it means third line at the bottom - of text entitle "Przepływ
> pracy z osobą integrującą zmiany" - is better to use "z" instead "do" (place
> in line - cit.: "... uprawnienia do odczytu do repozytorium innych osób w
> zespole ...")

The issues you found should better be reported to the progit community
on GitHub: https://github.com/progit/progit

Besides, the progit book 2nd version has already been released.  Please
check https://progit.org/translations to see how to start translation in new
languages.

Cheers,
Ray


git-svn: Why not use git-fast-import?

2015-09-22 Thread Victor Leschuk
Hello all, 

I've been playing with git-svn for some time already, and it seems to me
there are two major problems which make it hard to use in production
for large repositories: very low speed and a large memory footprint when
synchronizing with SVN repos (I am talking mostly about clone and fetch
operations). I was wondering why git-fast-import is not used for these
purposes? Are there any serious limitations which make it impossible?

I have found several alternative solutions which use git-fast-import but they 
all do only the initial import of a repo. I have looked through the 
documentation and didn't see why fast-import can't be used to sync an existing 
repo after the import. Am I missing something?

Thanks in advance for clarification.

--
Best Regards,
Victor


Small mistakes in translation

2015-09-22 Thread Przemysław Skrzyniarz
Hi,

On page =>
https://git-scm.com/book/pl/v1/Rozproszony-Git-Rozproszone-przep%C5%82ywy-pracy
< I found a typographical error in the first line (Is: powala => Should be:
pozwala) in the third part of the text entitled "Przepływ pracy z osobą
integrującą zmiany".
And in the next line - that is, the third line from the bottom of the text
entitled "Przepływ pracy z osobą integrującą zmiany" - it is better to use "z"
instead of "do" (place in line - cit.: "... uprawnienia do odczytu do
repozytorium innych osób w zespole ...")

Best wishes,
Przemysław Skrzyniarz