sb/submodule-parallel-fetch, was: Re: What's cooking in git.git (Dec 2015, #03; Thu, 10)

2015-12-11 Thread Johannes Sixt
Am 11.12.2015 um 00:55 schrieb Stefan Beller:
> On Thu, Dec 10, 2015 at 3:51 PM, Junio C Hamano  wrote:
>> Stefan Beller  writes:
>>
 * sb/submodule-parallel-fetch (2015-11-24) 17 commits
 ...
>>>
>>> I assume you plan on merging this after 2.7 settled and then we can
>>> also get the above sb/submodule-parallel-update going again.
>>
>> Yeah, thanks for reminding me.  I think that would be a good plan
>> that gives us an opportunity to clean up this topic, some parts of
>> which are have "an early patch that was too hastily merged to 'next'
>> had to be tweaked by an 'oops' follow-up patch in the topic"
>> pattern, e.g. "make waitpid the secondary and closed pipe the
>> primary way to monitor children".
>>
>> Thanks.
> 
> This makes it sound as if you would drop it from next once 2.7 is out,
> expecting a complete reroll, which does the right thing from the beginning?
> That was not was I was expecting, but thanks for clarifying.

Also, rebuilding the topic such that it takes the direct route to its
current state would help bisectability on Windows.

Generally, I'm already quite satisfied with the state of the
infrastructure at the tip of the branch.

Nevertheless, I have a few niggles.

The primary one is that we are using a global variable of type
struct parallel_processes to keep track of the processes. Fortunately,
most functions already take a pointer (I gather you did anticipate an
object oriented use of the functions). The only exception is pp_init():
It returns a pointer to the global object, which is then passed around
to the other functions. This does not conform to our usual style,
however, where the initializer function takes a pointer to the object,
too.

After converting pp_init, we can have a nicely object oriented
collection of functions and get rid of the global object ... almost.

We still need a global variable for the signal handler. But since
signals and their handlers are a global resource, it is not that bad to
have a global variable that is dedicated to signal handling.

Another small nit is that I found it confusing that the closure
parameters are arranged differently in the callback functions. Granted,
in get_next_task one of them is an out parameter, but that is actually
an argument to move it to the end of the parameter list, IMHO.

On top of that I found an error or two in the documentation, and I have
a few suggestions for improvements.

All this is summarized in the patch below.

diff --git a/run-command.c b/run-command.c
index db4d916..f3addb9 100644
--- a/run-command.c
+++ b/run-command.c
@@ -864,7 +864,7 @@ enum child_state {
GIT_CP_WAIT_CLEANUP,
 };
 
-static struct parallel_processes {
+struct parallel_processes {
void *data;
 
int max_processes;
@@ -890,7 +890,7 @@ static struct parallel_processes {
 
int output_owner;
struct strbuf buffered_output; /* of finished children */
-} parallel_processes_struct;
+};
 
 static int default_start_failure(struct child_process *cp,
 struct strbuf *err,
@@ -933,23 +933,23 @@ static void kill_children(struct parallel_processes *pp, 
int signo)
kill(pp->children[i].process.pid, signo);
 }
 
+static struct parallel_processes *pp_for_signal;
+
 static void handle_children_on_signal(int signo)
 {
-   struct parallel_processes *pp = _processes_struct;
-
-   kill_children(pp, signo);
+   kill_children(pp_for_signal, signo);
sigchain_pop(signo);
raise(signo);
 }
 
-static struct parallel_processes *pp_init(int n,
- get_next_task_fn get_next_task,
- start_failure_fn start_failure,
- task_finished_fn task_finished,
- void *data)
+static void pp_init(struct parallel_processes *pp,
+   int n,
+   get_next_task_fn get_next_task,
+   start_failure_fn start_failure,
+   task_finished_fn task_finished,
+   void *data)
 {
int i;
-   struct parallel_processes *pp = _processes_struct;
 
if (n < 1)
n = online_cpus();
@@ -976,8 +976,9 @@ static struct parallel_processes *pp_init(int n,
pp->pfd[i].events = POLLIN | POLLHUP;
pp->pfd[i].fd = -1;
}
+
+   pp_for_signal = pp;
sigchain_push_common(handle_children_on_signal);
-   return pp;
 }
 
 static void pp_cleanup(struct parallel_processes *pp)
@@ -1019,10 +1020,10 @@ static int pp_start_one(struct parallel_processes *pp)
if (i == pp->max_processes)
die("BUG: bookkeeping is hard");
 
-   code = pp->get_next_task(>children[i].data,
->children[i].process,
+   code = pp->get_next_task(>children[i].process,
 

Re: sb/submodule-parallel-fetch, was: Re: What's cooking in git.git (Dec 2015, #03; Thu, 10)

2015-12-11 Thread Stefan Beller
On Fri, Dec 11, 2015 at 1:37 PM, Johannes Sixt  wrote:
>
> Generally, I'm already quite satisfied with the state of the
> infrastructure at the tip of the branch.
>

I was about the rework the patch series.

> Nevertheless, I have a few niggles.

If you don't mind I'll split your patch into chunks and
squash these where appropriate and resend the series
with the suggestions included without the intermediate stages
of non compiling code for Windows.

>
> The primary one is that we are using a global variable of type
> struct parallel_processes to keep track of the processes. Fortunately,
> most functions already take a pointer (I gather you did anticipate an
> object oriented use of the functions). The only exception is pp_init():
> It returns a pointer to the global object, which is then passed around
> to the other functions. This does not conform to our usual style,
> however, where the initializer function takes a pointer to the object,
> too.

Noted. I thought about the init function as a constructor,
such as

foo *pp = new foo();

in C++ would be just syntactic sugar for what I did there.
I'll adhere to Git style then instead.

>
> After converting pp_init, we can have a nicely object oriented
> collection of functions and get rid of the global object ... almost.
>
> We still need a global variable for the signal handler. But since
> signals and their handlers are a global resource, it is not that bad to
> have a global variable that is dedicated to signal handling.
>
> Another small nit is that I found it confusing that the closure
> parameters are arranged differently in the callback functions. Granted,
> in get_next_task one of them is an out parameter, but that is actually
> an argument to move it to the end of the parameter list, IMHO.
>
> On top of that I found an error or two in the documentation, and I have
> a few suggestions for improvements.
>
> All this is summarized in the patch below.

Thanks for the improvements!

Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: sb/submodule-parallel-fetch, was: Re: What's cooking in git.git (Dec 2015, #03; Thu, 10)

2015-12-11 Thread Johannes Sixt

Am 11.12.2015 um 23:52 schrieb Stefan Beller:

If you don't mind I'll split your patch into chunks and
squash these where appropriate


I don't mind at all; please pick what you like.

-- Hannes

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html