Re: [PATCH v2 5/5] fetch: Make --jobs control submodules and remotes
On Wed, 14 Aug 2019 01:32:45 PDT (-0700), szeder@gmail.com wrote: On Mon, Aug 12, 2019 at 02:34:48PM -0700, Palmer Dabbelt wrote: diff --git a/builtin/fetch.c b/builtin/fetch.c index 67d001f3f78b..41498e9efb3b 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -114,6 +114,20 @@ static int git_fetch_config(const char *k, const char *v, void *cb) return git_default_config(k, v, cb); } +static int parse_jobs_arg(const struct option *opt, const char *arg, int unset) +{ + int jobs; + + jobs = atoi(arg); + if (jobs < 1) + die(_("There must be a positive number of jobs")); + + max_children_for_submodules = jobs; + max_children_for_fetch = jobs; + + return 0; +} + static int parse_refmap_arg(const struct option *opt, const char *arg, int unset) { BUG_ON_OPT_NEG(unset); @@ -142,12 +156,13 @@ static struct option builtin_fetch_options[] = { N_("fetch all tags and associated objects"), TAGS_SET), OPT_SET_INT('n', NULL, &tags, N_("do not fetch all tags (--no-tags)"), TAGS_UNSET), - OPT_INTEGER('j', "jobs", &max_children_for_submodules, + { OPTION_CALLBACK, 'j', "jobs", NULL, N_("jobs"), + N_("number of parallel tasks to run while fetching"), + PARSE_OPT_OPTARG, &parse_jobs_arg }, These changes result segmentation faults in the tests '--quiet propagates to parallel submodules' and 'fetching submodules respects parallel settings' in 't5526-fetch-submodules.sh'. If the number of jobs is specified as '-j 2' or '--jobs 7', i.e. as an unstuck argument of the option, as opposed to '-j2' or '--jobs=7', then 'arg' in the parse_jobs_arg() callback is NULL, which then causes the segfault somewhere inside that atoi() call. Thanks, I'll fix that in the v3.
Re: [PATCH v2 5/5] fetch: Make --jobs control submodules and remotes
SZEDER Gábor writes: > On Mon, Aug 12, 2019 at 02:34:48PM -0700, Palmer Dabbelt wrote: >> diff --git a/builtin/fetch.c b/builtin/fetch.c >> index 67d001f3f78b..41498e9efb3b 100644 >> --- a/builtin/fetch.c >> +++ b/builtin/fetch.c >> @@ -114,6 +114,20 @@ static int git_fetch_config(const char *k, const char >> *v, void *cb) >> return git_default_config(k, v, cb); >> } >> >> +static int parse_jobs_arg(const struct option *opt, const char *arg, int >> unset) >> +{ >> +int jobs; >> + >> +jobs = atoi(arg); >> +if (jobs < 1) >> +die(_("There must be a positive number of jobs")); >> + >> +max_children_for_submodules = jobs; >> +max_children_for_fetch = jobs; >> + >> +return 0; >> +} >> + >> static int parse_refmap_arg(const struct option *opt, const char *arg, int >> unset) >> { >> BUG_ON_OPT_NEG(unset); >> @@ -142,12 +156,13 @@ static struct option builtin_fetch_options[] = { >> N_("fetch all tags and associated objects"), TAGS_SET), >> OPT_SET_INT('n', NULL, &tags, >> N_("do not fetch all tags (--no-tags)"), TAGS_UNSET), >> -OPT_INTEGER('j', "jobs", &max_children_for_submodules, >> +{ OPTION_CALLBACK, 'j', "jobs", NULL, N_("jobs"), >> +N_("number of parallel tasks to run while fetching"), >> +PARSE_OPT_OPTARG, &parse_jobs_arg }, > > These changes result segmentation faults in the tests '--quiet > propagates to parallel submodules' and 'fetching submodules respects > parallel settings' in 't5526-fetch-submodules.sh'. > > If the number of jobs is specified as '-j 2' or '--jobs 7', i.e. as an > unstuck argument of the option, as opposed to '-j2' or '--jobs=7', > then 'arg' in the parse_jobs_arg() callback is NULL, which then causes > the segfault somewhere inside that atoi() call. True. An easier and more readable way would be to set another "nr-jobs" variable using plain vanilla OPT_INTEGER() and override the other two with it when they are not set after parse_options() returns, I guess.
Re: [PATCH v2 5/5] fetch: Make --jobs control submodules and remotes
On Mon, Aug 12, 2019 at 02:34:48PM -0700, Palmer Dabbelt wrote: > diff --git a/builtin/fetch.c b/builtin/fetch.c > index 67d001f3f78b..41498e9efb3b 100644 > --- a/builtin/fetch.c > +++ b/builtin/fetch.c > @@ -114,6 +114,20 @@ static int git_fetch_config(const char *k, const char > *v, void *cb) > return git_default_config(k, v, cb); > } > > +static int parse_jobs_arg(const struct option *opt, const char *arg, int > unset) > +{ > + int jobs; > + > + jobs = atoi(arg); > + if (jobs < 1) > + die(_("There must be a positive number of jobs")); > + > + max_children_for_submodules = jobs; > + max_children_for_fetch = jobs; > + > + return 0; > +} > + > static int parse_refmap_arg(const struct option *opt, const char *arg, int > unset) > { > BUG_ON_OPT_NEG(unset); > @@ -142,12 +156,13 @@ static struct option builtin_fetch_options[] = { > N_("fetch all tags and associated objects"), TAGS_SET), > OPT_SET_INT('n', NULL, &tags, > N_("do not fetch all tags (--no-tags)"), TAGS_UNSET), > - OPT_INTEGER('j', "jobs", &max_children_for_submodules, > + { OPTION_CALLBACK, 'j', "jobs", NULL, N_("jobs"), > + N_("number of parallel tasks to run while fetching"), > + PARSE_OPT_OPTARG, &parse_jobs_arg }, These changes result segmentation faults in the tests '--quiet propagates to parallel submodules' and 'fetching submodules respects parallel settings' in 't5526-fetch-submodules.sh'. If the number of jobs is specified as '-j 2' or '--jobs 7', i.e. as an unstuck argument of the option, as opposed to '-j2' or '--jobs=7', then 'arg' in the parse_jobs_arg() callback is NULL, which then causes the segfault somewhere inside that atoi() call.
Re: [PATCH v2 5/5] fetch: Make --jobs control submodules and remotes
On Tue, 13 Aug 2019 13:16:11 PDT (-0700), johannes.schinde...@gmx.de wrote: Hi, On Mon, 12 Aug 2019, Palmer Dabbelt wrote: The existing --jobs argument was defined to control the number of jobs used for submodule fetching, but it makes more sense to have this argument control the number of jobs to be used when fetching from multiple remotes as well. This patch simply changes the --jobs argument parsing code to set both max_children_for_{submodules,fetch}, as well as noting this new behavior in the documentation. Signed-off-by: Palmer Dabbelt --- I very much miss in this description a reflection of my analysis in https://public-inbox.org/git/nycvar.qro.7.76.6.1907191507420...@tvgsbejvaqbjf.bet/ Given that analysis, combined with the fact that the `--jobs` option tries to control both the `--multiple` and `--recursive-submodules` code paths in the end, anyway, I do doubt that it makes sense to even introduce the `--fetch-jobs` and the `--submodule-fetch-jobs` options; They are probably only confusing and do not add much benefit to the end user. The cover letter at least attempts to describe this. I figured I'd have to pick one option for a v2, so I went with the more complicated one under the assumption it would be easy to re-spin a v3 that drops the extra arguments. I'm happy to do so. Ciao, Johannes Documentation/fetch-options.txt | 4 builtin/fetch.c | 21 ++--- 2 files changed, 22 insertions(+), 3 deletions(-) diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt index 5836024f1934..0915fd4ed6d5 100644 --- a/Documentation/fetch-options.txt +++ b/Documentation/fetch-options.txt @@ -160,6 +160,10 @@ ifndef::git-pull[] -j:: --jobs=:: + Number of parallel children to be used for all forms of fetching. + This is the same as passing `--submodule-fetch-jobs=` and + `--fetch-jobs=`. + --submodule-fetch-jobs=:: Number of parallel children to be used for fetching submodules. Each will fetch from different submodules, such that fetching many diff --git a/builtin/fetch.c b/builtin/fetch.c index 67d001f3f78b..41498e9efb3b 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -114,6 +114,20 @@ static int git_fetch_config(const char *k, const char *v, void *cb) return git_default_config(k, v, cb); } +static int parse_jobs_arg(const struct option *opt, const char *arg, int unset) +{ + int jobs; + + jobs = atoi(arg); + if (jobs < 1) + die(_("There must be a positive number of jobs")); + + max_children_for_submodules = jobs; + max_children_for_fetch = jobs; + + return 0; +} + static int parse_refmap_arg(const struct option *opt, const char *arg, int unset) { BUG_ON_OPT_NEG(unset); @@ -142,12 +156,13 @@ static struct option builtin_fetch_options[] = { N_("fetch all tags and associated objects"), TAGS_SET), OPT_SET_INT('n', NULL, &tags, N_("do not fetch all tags (--no-tags)"), TAGS_UNSET), - OPT_INTEGER('j', "jobs", &max_children_for_submodules, + { OPTION_CALLBACK, 'j', "jobs", NULL, N_("jobs"), + N_("number of parallel tasks to run while fetching"), + PARSE_OPT_OPTARG, &parse_jobs_arg }, + OPT_INTEGER(0, "submodule-fetch-jobs", &max_children_for_submodules, N_("number of submodules fetched in parallel")), OPT_INTEGER(0, "fetch-jobs", &max_children_for_fetch, N_("number of remotes fetched in parallel")), - OPT_INTEGER(0, "submodule-fetch-jobs", &max_children_for_submodules, - N_("number of submodules fetched in parallel")), OPT_BOOL('p', "prune", &prune, N_("prune remote-tracking branches no longer on remote")), OPT_BOOL('P', "prune-tags", &prune_tags, -- 2.21.0
Re: [PATCH v2 5/5] fetch: Make --jobs control submodules and remotes
On Tue, 13 Aug 2019 15:00:13 PDT (-0700), gits...@pobox.com wrote: Palmer Dabbelt writes: The existing --jobs argument was defined to control the number of jobs used for submodule fetching, but it makes more sense to have this argument control the number of jobs to be used when fetching from multiple remotes as well. This patch simply changes the --jobs argument parsing code to set both max_children_for_{submodules,fetch}, as well as noting this new behavior in the documentation. That's a sensible, if overly careful, transition plan. This patch cannot be queued together with the other four, though, for the plan to be practical. It probably needs to come a few releases after the other four hits a release. A less involved and much more careless transition plan may be to just declare that "--jobs that only controls submodule fetches is a bug and it must also affect how fetches from multiple remote repositories are done" and come directly to this step, without necessarily introducing options that control them independently. I have a suspicion that we can afford to go the careless route for this particular one, if we wanted to, as not may people would care about controlling these independently. This was brought up as part of the v1, and the cover letter lays out a plan to do so. I'm happy to re-spin the patch set to just have --jobs control everything.
Re: [PATCH v2 5/5] fetch: Make --jobs control submodules and remotes
Palmer Dabbelt writes: > The existing --jobs argument was defined to control the number of jobs > used for submodule fetching, but it makes more sense to have this > argument control the number of jobs to be used when fetching from > multiple remotes as well. > > This patch simply changes the --jobs argument parsing code to set both > max_children_for_{submodules,fetch}, as well as noting this new behavior > in the documentation. That's a sensible, if overly careful, transition plan. This patch cannot be queued together with the other four, though, for the plan to be practical. It probably needs to come a few releases after the other four hits a release. A less involved and much more careless transition plan may be to just declare that "--jobs that only controls submodule fetches is a bug and it must also affect how fetches from multiple remote repositories are done" and come directly to this step, without necessarily introducing options that control them independently. I have a suspicion that we can afford to go the careless route for this particular one, if we wanted to, as not may people would care about controlling these independently.
Re: [PATCH v2 5/5] fetch: Make --jobs control submodules and remotes
Hi, On Mon, 12 Aug 2019, Palmer Dabbelt wrote: > The existing --jobs argument was defined to control the number of jobs > used for submodule fetching, but it makes more sense to have this > argument control the number of jobs to be used when fetching from > multiple remotes as well. > > This patch simply changes the --jobs argument parsing code to set both > max_children_for_{submodules,fetch}, as well as noting this new behavior > in the documentation. > > Signed-off-by: Palmer Dabbelt > --- I very much miss in this description a reflection of my analysis in https://public-inbox.org/git/nycvar.qro.7.76.6.1907191507420...@tvgsbejvaqbjf.bet/ Given that analysis, combined with the fact that the `--jobs` option tries to control both the `--multiple` and `--recursive-submodules` code paths in the end, anyway, I do doubt that it makes sense to even introduce the `--fetch-jobs` and the `--submodule-fetch-jobs` options; They are probably only confusing and do not add much benefit to the end user. Ciao, Johannes > Documentation/fetch-options.txt | 4 > builtin/fetch.c | 21 ++--- > 2 files changed, 22 insertions(+), 3 deletions(-) > > diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt > index 5836024f1934..0915fd4ed6d5 100644 > --- a/Documentation/fetch-options.txt > +++ b/Documentation/fetch-options.txt > @@ -160,6 +160,10 @@ ifndef::git-pull[] > > -j:: > --jobs=:: > + Number of parallel children to be used for all forms of fetching. > + This is the same as passing `--submodule-fetch-jobs=` and > + `--fetch-jobs=`. > + > --submodule-fetch-jobs=:: > Number of parallel children to be used for fetching submodules. > Each will fetch from different submodules, such that fetching many > diff --git a/builtin/fetch.c b/builtin/fetch.c > index 67d001f3f78b..41498e9efb3b 100644 > --- a/builtin/fetch.c > +++ b/builtin/fetch.c > @@ -114,6 +114,20 @@ static int git_fetch_config(const char *k, const char > *v, void *cb) > return git_default_config(k, v, cb); > } > > +static int parse_jobs_arg(const struct option *opt, const char *arg, int > unset) > +{ > + int jobs; > + > + jobs = atoi(arg); > + if (jobs < 1) > + die(_("There must be a positive number of jobs")); > + > + max_children_for_submodules = jobs; > + max_children_for_fetch = jobs; > + > + return 0; > +} > + > static int parse_refmap_arg(const struct option *opt, const char *arg, int > unset) > { > BUG_ON_OPT_NEG(unset); > @@ -142,12 +156,13 @@ static struct option builtin_fetch_options[] = { > N_("fetch all tags and associated objects"), TAGS_SET), > OPT_SET_INT('n', NULL, &tags, > N_("do not fetch all tags (--no-tags)"), TAGS_UNSET), > - OPT_INTEGER('j', "jobs", &max_children_for_submodules, > + { OPTION_CALLBACK, 'j', "jobs", NULL, N_("jobs"), > + N_("number of parallel tasks to run while fetching"), > + PARSE_OPT_OPTARG, &parse_jobs_arg }, > + OPT_INTEGER(0, "submodule-fetch-jobs", &max_children_for_submodules, > N_("number of submodules fetched in parallel")), > OPT_INTEGER(0, "fetch-jobs", &max_children_for_fetch, > N_("number of remotes fetched in parallel")), > - OPT_INTEGER(0, "submodule-fetch-jobs", &max_children_for_submodules, > - N_("number of submodules fetched in parallel")), > OPT_BOOL('p', "prune", &prune, >N_("prune remote-tracking branches no longer on remote")), > OPT_BOOL('P', "prune-tags", &prune_tags, > -- > 2.21.0 > >