On Tue, Feb 9, 2016 at 2:34 PM, Jonathan Nieder <[email protected]> wrote:
> Hi,
>
> Stefan Beller wrote:
>
>> +++ b/submodule.c
> [...]
>> @@ -169,7 +170,13 @@ void set_diffopt_flags_from_submodule_config(struct
>> diff_options *diffopt,
>>
>> int submodule_config(const char *var, const char *value, void *cb)
>> {
>> - if (starts_with(var, "submodule."))
>> + if (!strcmp(var, "submodule.fetchjobs")) {
>> + unsigned long val;
>> + if (!git_parse_ulong(value, &val) || 0 > val || val >= INT_MAX)
>> + die(_("Error parsing submodule.fetchJobs %s"), value);
>
> 'val < 0' would be more idiomatic than '0 > val'. More importantly,
> val is an unsigned long. How could it be negative?
>
> Is it intended that val == INT_MAX is not permitted? I would have
> expected something like the following to work:
>
> unsigned long val = git_config_ulong(var, value);
> if (val > INT_MAX) {
> errno = ERANGE;
> die_bad_number(var, value);
> }
>
> (using die_bad_number from config.c). Or config.c could gain a
> git_config_nonnegative_int helper:
>
> static int git_parse_nonnegative_int(const char *value, int *ret)
> {
> uintmax_t tmp;
> if (!git_parse_unsigned(value, &tmp,
> maximum_signed_value_of_type(int)))
> return 0;
> *ret = tmp;
> return 1;
> }
>
> int git_config_nonnegative_int(const char *name, const char *value)
> {
> int ret;
> if (!git_parse_nonnegative_int(value, &ret))
> die_bad_number(name, value);
> return ret;
> }
>
> allowing
>
> parallel_jobs = git_config_nonnegative_int(var, val);
> return 0;
And I thought we wanted to prevent inventing yet another helper?
Although this looks very appealing, so let's do that instead.
>
> [...]
>> @@ -751,6 +758,9 @@ int fetch_populated_submodules(const struct argv_array
>> *options,
>> argv_array_push(&spf.args, "--recurse-submodules-default");
>> /* default value, "--submodule-prefix" and its value are added later */
>>
>> + if (max_parallel_jobs < 0)
>> + max_parallel_jobs = parallel_jobs;
>
> Makes sense.
>
> [...]
>> @@ -1097,3 +1107,8 @@ void connect_work_tree_and_git_dir(const char
>> *work_tree, const char *git_dir)
>> strbuf_release(&rel_path);
>> free((void *)real_work_tree);
>> }
>> +
>> +int parallel_submodules(void)
>> +{
>> + return parallel_jobs;
>> +}
>
> Is this helper used?
Not yet, but I rather want to introduce it here as it is easier to review here
than in a later patch?
>
> [...]
>> +++ b/submodule.h
>> @@ -41,5 +41,6 @@ int find_unpushed_submodules(unsigned char new_sha1[20],
>> const char *remotes_nam
>> struct string_list *needs_pushing);
>> int push_unpushed_submodules(unsigned char new_sha1[20], const char
>> *remotes_name);
>> void connect_work_tree_and_git_dir(const char *work_tree, const char
>> *git_dir);
>> +int parallel_submodules(void);
>
> optional trick: one way to avoid merge conflicts is to add each function
> at some logical place in the file instead of at the end. Another nice
> side-effect is that it makes it easier to read through the header since
> functions appear in some appropriate order.
>
> E.g. this method could go near the config functions, I suppose.
I was not sure how relevant this is to configuration as it seems a bit
special w.r.t. submodules to me (it's not configured in .gitmodules, but a
standard configuration in .git/config after all).
>
>> --- a/t/t5526-fetch-submodules.sh
>> +++ b/t/t5526-fetch-submodules.sh
>> @@ -471,4 +471,18 @@ test_expect_success "don't fetch submodule when newly
>> recorded commits are alrea
>> test_i18ncmp expect.err actual.err
>> '
>>
>> +test_expect_success 'fetching submodules respects parallel settings' '
>
> Makes sense. Same trick about inserting in some appropriate place in
> the middle of the file applies here, too. In tests it also ends up
> being useful for finding when tests overlap or when there's a gap in
> coverage.
>
> The documentation says that '0' does something appropriate (DWIM or
> something? I didn't understand it). Perhaps a test for that behavior
> would be useful, too.
The 0 is currently using 'online_cpus()' by default as a 'something
appropriate',
which is not really appropriate though. The documentation is worded
such that people should not rely on certain behaviors of '0'.
Maybe instead we could just hardcode 0 to 2 for now and test for that
instead. My reasoning for '2':
* provides the least amount of parallelism
* when fetching/cloning you have phases when the server works or when
the client works. In absence of any good metric, I estimate the work load
to be split 50:50. If that is the case the actual optimal number is 2 with
these two jobs being shifted such that each client and server are
loaded all the time.
>
> Thanks,
> Jonathan
Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html