Re: [PATCH v2 5/9] gpg-interface: introduce an abstraction for multiple gpg formats
Am Tue, 10 Jul 2018 12:23:32 -0400 schrieb Jeff King : > On Tue, Jul 10, 2018 at 10:52:27AM +0200, Henning Schild wrote: > > > Create a struct that holds the format details for the supported > > formats. At the moment that is still just "openpgp". This commit > > prepares for the introduction of more formats, that might use other > > programs and match other signatures. > > Great, this looks like a good incremental step. > > > static char *configured_signing_key; > > -static const char *gpg_format = "openpgp"; > > -static const char *gpg_program = "gpg"; > > +struct gpg_format_data { > > + const char *format; > > + const char *program; > > + const char *extra_args_verify[1]; > > + const char *sigs[2]; > > +}; > > These magic numbers are at a weird distance from where we fill them > in: > > > +struct gpg_format_data gpg_formats[] = { > > + { .format = "openpgp", .program = "gpg", > > + .extra_args_verify = { "--keyid-format=long" }, > > + .sigs = { PGP_SIGNATURE, PGP_MESSAGE } > > + }, > > +}; > > I'm not sure if we can easily do any better in C, though. Declaring > the struct with an open-ended "[]" would make the compiler unhappy. > We could do something like: > > struct gpg_format_data { > ... > const char **extra_args_verify; > }; > ... > static const char *openpgp_verify_args[] = { > "--key-id-format=long" > }; > ... > static struct gpg_format_data gpg_formats[] = { > { ... > .extra_args_verify = openpgp_verify_args > } > }; > > I'm not sure if that's more horrible or less. It's worse to write in > the first place, but it's slightly easier to maintain going forward. I > dunno. I switched to that, looks good but i am also not sure which one is better. > > +enum gpgformats { PGP_FMT }; > > Looks like we use this only for indexing the gpg_formats array. I know > that C guarantees 0-indexing, but if we're depending on it, it might > be worth writing out "PGP_FMT = 0" explicitly. And probably adding a > comment that this needs to remain in sync with the array. > > The other alternative is that we could simply use > get_format_data("openpgp"), though that does add a minor runtime cost. Thanks, i got rid of the enum and use get_format_data with two fixed strings for the two possible matches. > > +struct gpg_format_data gpg_formats[] = { > > + { .format = "openpgp", .program = "gpg", > > + .extra_args_verify = { "--keyid-format=long" }, > > + .sigs = { PGP_SIGNATURE, PGP_MESSAGE } > > + }, > > +}; > > This array should be marked static, I think. Yes. > > +static struct gpg_format_data *get_format_data(const char *str) > > +{ > > + int i; > > + for (i = 0; i < ARRAY_SIZE(gpg_formats); i++) > > + if (!strcasecmp(gpg_formats[i].format, str)) > > + return gpg_formats + i; > > + return NULL; > > +} > > This looks much nicer than the assert()-ing version from v1. > > > +static struct gpg_format_data *get_format_data_by_sig(const char > > *sig) +{ > > + int i, j; > > + for (i = 0; i < ARRAY_SIZE(gpg_formats); i++) > > + for (j = 0; j < ARRAY_SIZE(gpg_formats[i].sigs); > > j++) > > + if (gpg_formats[i].sigs[j] && > > + !strncmp(gpg_formats[i].sigs[j], sig, > > + > > strlen(gpg_formats[i].sigs[j]))) > > + return gpg_formats + i; > > + return NULL; > > +} > > This might be a little more readable with: > > starts_with(sig, gpg_formats[i].sigs[j]) > > instead of the strncmp. It may also be more efficient, as we don't > have to compute the strlen of the prefix for each non-matching line > (the compiler _might_ be smart enough to realize these are all string > literals, but it's pretty buried). Thanks, will do. > I also wondered if our prefix matching here is overly loose. We have > to do a prefix match, since "sig" isn't terminated at the line > buffer. So I think we'd match: > > --- BEGIN PGP MESSAGE --- AND SOME OTHER STUFF --- > > on a line. But I think that's no different than the current code. If > we care, I guess we could look for '\n' or '\0' immediately after. Let us not care, just like current code. > > static int is_gpg_start(const char *line) > > { > > - return starts_with(line, PGP_SIGNATURE) || > > - starts_with(line, PGP_MESSAGE); > > + return (get_format_data_by_sig(line) != NULL); > > } > > I don't know if we've ever discussed this style explicitly, but we'd > usually omit the unnecessary parentheses for the return here. Will remove those braces. > > @@ -140,18 +173,14 @@ int git_gpg_config(const char *var, const > > char *value, void *cb) } > > > > if (!strcmp(var, "gpg.format")) { > > - if (strcasecmp(value, "openpgp")) > > + if (!get_format_data(value)) > > return error("malformed value for %s: %s", > > var, value); return git_config_string(_format, var, value); > > } > > Much nicer
Re: [PATCH v2 5/9] gpg-interface: introduce an abstraction for multiple gpg formats
Am Tue, 10 Jul 2018 10:16:39 -0700 schrieb Junio C Hamano : > Henning Schild writes: > > > Create a struct that holds the format details for the supported > > formats. At the moment that is still just "openpgp". This commit > > prepares for the introduction of more formats, that might use other > > programs and match other signatures. > > > > Signed-off-by: Henning Schild > > --- > > gpg-interface.c | 74 > > - 1 file > > changed, 58 insertions(+), 16 deletions(-) > > > > diff --git a/gpg-interface.c b/gpg-interface.c > > index ed0e55917..0a8d1bff3 100644 > > --- a/gpg-interface.c > > +++ b/gpg-interface.c > > @@ -7,12 +7,46 @@ > > #include "tempfile.h" > > > > static char *configured_signing_key; > > -static const char *gpg_format = "openpgp"; > > -static const char *gpg_program = "gpg"; > > +struct gpg_format_data { > > + const char *format; > > + const char *program; > > + const char *extra_args_verify[1]; > > + const char *sigs[2]; > > +}; > > Do you use identifier "gpg_format" elsewhere as a structure name? A > structure is there always to hold "data", so often "_data" suffix is > meaningless in a structure type name like this. i renamed the struct to "gpg_format" and what was "gpg_format" before, a const char* holding the format we use for signing, is now a pointer into gpg_formats and called "current_format" > > #define PGP_SIGNATURE "-BEGIN PGP SIGNATURE-" > > #define PGP_MESSAGE "-BEGIN PGP MESSAGE-" > > > > +enum gpgformats { PGP_FMT }; > > +struct gpg_format_data gpg_formats[] = { > > + { .format = "openpgp", .program = "gpg", > > + .extra_args_verify = { "--keyid-format=long" }, > > + .sigs = { PGP_SIGNATURE, PGP_MESSAGE } > > + }, > > +}; > > +static const char *gpg_format = "openpgp"; > > + > > +static struct gpg_format_data *get_format_data(const char *str) > > get_format_data_by_name() or something like that, for consistency > with the next function, perhaps? I went for get_format_by_name and also renamed "format" inside the struct to be "name" > > +{ > > + int i; > > + for (i = 0; i < ARRAY_SIZE(gpg_formats); i++) > > + if (!strcasecmp(gpg_formats[i].format, str)) > > + return gpg_formats + i; > > + return NULL; > > +} > > + > > +static struct gpg_format_data *get_format_data_by_sig(const char > > *sig) +{ > > + int i, j; > > + for (i = 0; i < ARRAY_SIZE(gpg_formats); i++) > > + for (j = 0; j < ARRAY_SIZE(gpg_formats[i].sigs); > > j++) > > + if (gpg_formats[i].sigs[j] && > > + !strncmp(gpg_formats[i].sigs[j], sig, > > + > > strlen(gpg_formats[i].sigs[j]))) > > + return gpg_formats + i; > > + return NULL; > > +} > > + > > void signature_check_clear(struct signature_check *sigc) > > { > > FREE_AND_NULL(sigc->payload); > > @@ -104,8 +138,7 @@ void print_signature_buffer(const struct > > signature_check *sigc, unsigned flags) > > static int is_gpg_start(const char *line) > > { > > - return starts_with(line, PGP_SIGNATURE) || > > - starts_with(line, PGP_MESSAGE); > > + return (get_format_data_by_sig(line) != NULL); > > If this is still called "is_gpg_start()", then shouldn't the > implementation be more like this? > > struct gpg_format *found = get_format_data_by_sig(line); > > return found && found == _formats[PGP_FMT]; > > It probably does not make much difference in the end, as you won't > have functions is_(gpg|gpgsm|somethingelse|yetanother)_start() but > either have a single function "does some x509 looking block start > here?" or "I am expecting gpg and nothing else, does a block for > that start here?" and at that point this will no longer be called > is_gpg_start(), I would expect. I dropped the whole thing since it is now a one-liner and only called on one place. > > size_t parse_signature(const char *buf, size_t size) > > @@ -140,18 +173,14 @@ int git_gpg_config(const char *var, const > > char *value, void *cb) } > > > > if (!strcmp(var, "gpg.format")) { > > - if (strcasecmp(value, "openpgp")) > > + if (!get_format_data(value)) > > return error("malformed value for %s: %s", > > var, value); return git_config_string(_format, var, value); > > This is a bug introduced in an earlier step in the series, but you > will segfault when given a configuration that looks like this: > > [gpg] > format > > Imitate the way how !value is diagnosed upfront for "gpg.program" > below before this patch. The next version will take care of this. > > } > > > > - if (!strcmp(var, "gpg.program")) { > > - if (!value) > > - return config_error_nonbool(var); > > - gpg_program = xstrdup(value); > > - return 0; > > - } > > - > > + if (!strcmp(var, "gpg.program")) > > + return > >
Re: [PATCH v2 5/9] gpg-interface: introduce an abstraction for multiple gpg formats
Henning Schild writes: > Create a struct that holds the format details for the supported formats. > At the moment that is still just "openpgp". This commit prepares for the > introduction of more formats, that might use other programs and match > other signatures. > > Signed-off-by: Henning Schild > --- > gpg-interface.c | 74 > - > 1 file changed, 58 insertions(+), 16 deletions(-) > > diff --git a/gpg-interface.c b/gpg-interface.c > index ed0e55917..0a8d1bff3 100644 > --- a/gpg-interface.c > +++ b/gpg-interface.c > @@ -7,12 +7,46 @@ > #include "tempfile.h" > > static char *configured_signing_key; > -static const char *gpg_format = "openpgp"; > -static const char *gpg_program = "gpg"; > +struct gpg_format_data { > + const char *format; > + const char *program; > + const char *extra_args_verify[1]; > + const char *sigs[2]; > +}; Do you use identifier "gpg_format" elsewhere as a structure name? A structure is there always to hold "data", so often "_data" suffix is meaningless in a structure type name like this. > #define PGP_SIGNATURE "-BEGIN PGP SIGNATURE-" > #define PGP_MESSAGE "-BEGIN PGP MESSAGE-" > > +enum gpgformats { PGP_FMT }; > +struct gpg_format_data gpg_formats[] = { > + { .format = "openpgp", .program = "gpg", > + .extra_args_verify = { "--keyid-format=long" }, > + .sigs = { PGP_SIGNATURE, PGP_MESSAGE } > + }, > +}; > +static const char *gpg_format = "openpgp"; > + > +static struct gpg_format_data *get_format_data(const char *str) get_format_data_by_name() or something like that, for consistency with the next function, perhaps? > +{ > + int i; > + for (i = 0; i < ARRAY_SIZE(gpg_formats); i++) > + if (!strcasecmp(gpg_formats[i].format, str)) > + return gpg_formats + i; > + return NULL; > +} > + > +static struct gpg_format_data *get_format_data_by_sig(const char *sig) > +{ > + int i, j; > + for (i = 0; i < ARRAY_SIZE(gpg_formats); i++) > + for (j = 0; j < ARRAY_SIZE(gpg_formats[i].sigs); j++) > + if (gpg_formats[i].sigs[j] && > + !strncmp(gpg_formats[i].sigs[j], sig, > + strlen(gpg_formats[i].sigs[j]))) > + return gpg_formats + i; > + return NULL; > +} > + > void signature_check_clear(struct signature_check *sigc) > { > FREE_AND_NULL(sigc->payload); > @@ -104,8 +138,7 @@ void print_signature_buffer(const struct signature_check > *sigc, unsigned flags) > > static int is_gpg_start(const char *line) > { > - return starts_with(line, PGP_SIGNATURE) || > - starts_with(line, PGP_MESSAGE); > + return (get_format_data_by_sig(line) != NULL); If this is still called "is_gpg_start()", then shouldn't the implementation be more like this? struct gpg_format *found = get_format_data_by_sig(line); return found && found == _formats[PGP_FMT]; It probably does not make much difference in the end, as you won't have functions is_(gpg|gpgsm|somethingelse|yetanother)_start() but either have a single function "does some x509 looking block start here?" or "I am expecting gpg and nothing else, does a block for that start here?" and at that point this will no longer be called is_gpg_start(), I would expect. > size_t parse_signature(const char *buf, size_t size) > @@ -140,18 +173,14 @@ int git_gpg_config(const char *var, const char *value, > void *cb) > } > > if (!strcmp(var, "gpg.format")) { > - if (strcasecmp(value, "openpgp")) > + if (!get_format_data(value)) > return error("malformed value for %s: %s", var, value); > return git_config_string(_format, var, value); This is a bug introduced in an earlier step in the series, but you will segfault when given a configuration that looks like this: [gpg] format Imitate the way how !value is diagnosed upfront for "gpg.program" below before this patch. > } > > - if (!strcmp(var, "gpg.program")) { > - if (!value) > - return config_error_nonbool(var); > - gpg_program = xstrdup(value); > - return 0; > - } > - > + if (!strcmp(var, "gpg.program")) > + return git_config_string(_formats[PGP_FMT].program, var, > + value); git_config_string() has its own "do not feed boolean 'true' to me" check, so this change does not introduce a regression. > @@ -165,12 +194,16 @@ const char *get_signing_key(void) > int sign_buffer(struct strbuf *buffer, struct strbuf *signature, const char > *signing_key) > { > struct child_process gpg = CHILD_PROCESS_INIT; > + struct gpg_format_data *fmt; > int ret; > size_t i, j, bottom; > struct strbuf gpg_status = STRBUF_INIT; > > + fmt =
Re: [PATCH v2 5/9] gpg-interface: introduce an abstraction for multiple gpg formats
On Tue, Jul 10, 2018 at 10:52:27AM +0200, Henning Schild wrote: > Create a struct that holds the format details for the supported formats. > At the moment that is still just "openpgp". This commit prepares for the > introduction of more formats, that might use other programs and match > other signatures. Great, this looks like a good incremental step. > static char *configured_signing_key; > -static const char *gpg_format = "openpgp"; > -static const char *gpg_program = "gpg"; > +struct gpg_format_data { > + const char *format; > + const char *program; > + const char *extra_args_verify[1]; > + const char *sigs[2]; > +}; These magic numbers are at a weird distance from where we fill them in: > +struct gpg_format_data gpg_formats[] = { > + { .format = "openpgp", .program = "gpg", > + .extra_args_verify = { "--keyid-format=long" }, > + .sigs = { PGP_SIGNATURE, PGP_MESSAGE } > + }, > +}; I'm not sure if we can easily do any better in C, though. Declaring the struct with an open-ended "[]" would make the compiler unhappy. We could do something like: struct gpg_format_data { ... const char **extra_args_verify; }; ... static const char *openpgp_verify_args[] = { "--key-id-format=long" }; ... static struct gpg_format_data gpg_formats[] = { { ... .extra_args_verify = openpgp_verify_args } }; I'm not sure if that's more horrible or less. It's worse to write in the first place, but it's slightly easier to maintain going forward. I dunno. > +enum gpgformats { PGP_FMT }; Looks like we use this only for indexing the gpg_formats array. I know that C guarantees 0-indexing, but if we're depending on it, it might be worth writing out "PGP_FMT = 0" explicitly. And probably adding a comment that this needs to remain in sync with the array. The other alternative is that we could simply use get_format_data("openpgp"), though that does add a minor runtime cost. > +struct gpg_format_data gpg_formats[] = { > + { .format = "openpgp", .program = "gpg", > + .extra_args_verify = { "--keyid-format=long" }, > + .sigs = { PGP_SIGNATURE, PGP_MESSAGE } > + }, > +}; This array should be marked static, I think. > +static struct gpg_format_data *get_format_data(const char *str) > +{ > + int i; > + for (i = 0; i < ARRAY_SIZE(gpg_formats); i++) > + if (!strcasecmp(gpg_formats[i].format, str)) > + return gpg_formats + i; > + return NULL; > +} This looks much nicer than the assert()-ing version from v1. > +static struct gpg_format_data *get_format_data_by_sig(const char *sig) > +{ > + int i, j; > + for (i = 0; i < ARRAY_SIZE(gpg_formats); i++) > + for (j = 0; j < ARRAY_SIZE(gpg_formats[i].sigs); j++) > + if (gpg_formats[i].sigs[j] && > + !strncmp(gpg_formats[i].sigs[j], sig, > + strlen(gpg_formats[i].sigs[j]))) > + return gpg_formats + i; > + return NULL; > +} This might be a little more readable with: starts_with(sig, gpg_formats[i].sigs[j]) instead of the strncmp. It may also be more efficient, as we don't have to compute the strlen of the prefix for each non-matching line (the compiler _might_ be smart enough to realize these are all string literals, but it's pretty buried). I also wondered if our prefix matching here is overly loose. We have to do a prefix match, since "sig" isn't terminated at the line buffer. So I think we'd match: --- BEGIN PGP MESSAGE --- AND SOME OTHER STUFF --- on a line. But I think that's no different than the current code. If we care, I guess we could look for '\n' or '\0' immediately after. > static int is_gpg_start(const char *line) > { > - return starts_with(line, PGP_SIGNATURE) || > - starts_with(line, PGP_MESSAGE); > + return (get_format_data_by_sig(line) != NULL); > } I don't know if we've ever discussed this style explicitly, but we'd usually omit the unnecessary parentheses for the return here. > @@ -140,18 +173,14 @@ int git_gpg_config(const char *var, const char *value, > void *cb) > } > > if (!strcmp(var, "gpg.format")) { > - if (strcasecmp(value, "openpgp")) > + if (!get_format_data(value)) > return error("malformed value for %s: %s", var, value); > return git_config_string(_format, var, value); > } Much nicer than v1. > @@ -165,12 +194,16 @@ const char *get_signing_key(void) > int sign_buffer(struct strbuf *buffer, struct strbuf *signature, const char > *signing_key) > { > struct child_process gpg = CHILD_PROCESS_INIT; > + struct gpg_format_data *fmt; > int ret; > size_t i, j, bottom; > struct strbuf gpg_status = STRBUF_INIT; > > + fmt = get_format_data(gpg_format); > + if (!fmt) > + BUG("bad gpg_format '%s'", gpg_format);