Re: [PATCH v2 5/9] gpg-interface: introduce an abstraction for multiple gpg formats

2018-07-13 Thread Henning Schild
Am Tue, 10 Jul 2018 12:23:32 -0400
schrieb Jeff King :

> On Tue, Jul 10, 2018 at 10:52:27AM +0200, Henning Schild wrote:
> 
> > Create a struct that holds the format details for the supported
> > formats. At the moment that is still just "openpgp". This commit
> > prepares for the introduction of more formats, that might use other
> > programs and match other signatures.  
> 
> Great, this looks like a good incremental step.
> 
> >  static char *configured_signing_key;
> > -static const char *gpg_format = "openpgp";
> > -static const char *gpg_program = "gpg";
> > +struct gpg_format_data {
> > +   const char *format;
> > +   const char *program;
> > +   const char *extra_args_verify[1];
> > +   const char *sigs[2];
> > +};  
> 
> These magic numbers are at a weird distance from where we fill them
> in:
> 
> > +struct gpg_format_data gpg_formats[] = {
> > +   { .format = "openpgp", .program = "gpg",
> > + .extra_args_verify = { "--keyid-format=long" },
> > + .sigs = { PGP_SIGNATURE, PGP_MESSAGE }
> > +   },
> > +};  
> 
> I'm not sure if we can easily do any better in C, though. Declaring
> the struct with an open-ended "[]" would make the compiler unhappy.
> We could do something like:
> 
>   struct gpg_format_data {
>   ...
>   const char **extra_args_verify;
>   };
>   ...
>   static const char *openpgp_verify_args[] = {
>   "--key-id-format=long"
>   };
>   ...
>   static struct gpg_format_data gpg_formats[] = {
>   { ...
> .extra_args_verify = openpgp_verify_args
>   }
>   };
> 
> I'm not sure if that's more horrible or less. It's worse to write in
> the first place, but it's slightly easier to maintain going forward. I
> dunno.

I switched to that, looks good but i am also not sure which one is
better.

> > +enum gpgformats { PGP_FMT };  
> 
> Looks like we use this only for indexing the gpg_formats array. I know
> that C guarantees 0-indexing, but if we're depending on it, it might
> be worth writing out "PGP_FMT = 0" explicitly. And probably adding a
> comment that this needs to remain in sync with the array.
> 
> The other alternative is that we could simply use
> get_format_data("openpgp"), though that does add a minor runtime cost.

Thanks, i got rid of the enum and use get_format_data with two fixed
strings for the two possible matches.

> > +struct gpg_format_data gpg_formats[] = {
> > +   { .format = "openpgp", .program = "gpg",
> > + .extra_args_verify = { "--keyid-format=long" },
> > + .sigs = { PGP_SIGNATURE, PGP_MESSAGE }
> > +   },
> > +};  
> 
> This array should be marked static, I think.

Yes.

> > +static struct gpg_format_data *get_format_data(const char *str)
> > +{
> > +   int i;
> > +   for (i = 0; i < ARRAY_SIZE(gpg_formats); i++)
> > +   if (!strcasecmp(gpg_formats[i].format, str))
> > +   return gpg_formats + i;
> > +   return NULL;
> > +}  
> 
> This looks much nicer than the assert()-ing version from v1.
> 
> > +static struct gpg_format_data *get_format_data_by_sig(const char
> > *sig) +{
> > +   int i, j;
> > +   for (i = 0; i < ARRAY_SIZE(gpg_formats); i++)
> > +   for (j = 0; j < ARRAY_SIZE(gpg_formats[i].sigs);
> > j++)
> > +   if (gpg_formats[i].sigs[j] && 
> > +   !strncmp(gpg_formats[i].sigs[j], sig,
> > +
> > strlen(gpg_formats[i].sigs[j])))
> > +   return gpg_formats + i;
> > +   return NULL;
> > +}  
> 
> This might be a little more readable with:
> 
>   starts_with(sig, gpg_formats[i].sigs[j])
> 
> instead of the strncmp. It may also be more efficient, as we don't
> have to compute the strlen of the prefix for each non-matching line
> (the compiler _might_ be smart enough to realize these are all string
> literals, but it's pretty buried).

Thanks, will do.

> I also wondered if our prefix matching here is overly loose. We have
> to do a prefix match, since "sig" isn't terminated at the line
> buffer. So I think we'd match:
> 
>   --- BEGIN PGP MESSAGE --- AND SOME OTHER STUFF ---
> 
> on a line. But I think that's no different than the current code. If
> we care, I guess we could look for '\n' or '\0' immediately after.

Let us not care, just like current code.

> >  static int is_gpg_start(const char *line)
> >  {
> > -   return starts_with(line, PGP_SIGNATURE) ||
> > -   starts_with(line, PGP_MESSAGE);
> > +   return (get_format_data_by_sig(line) != NULL);
> >  }  
> 
> I don't know if we've ever discussed this style explicitly, but we'd
> usually omit the unnecessary parentheses for the return here.

Will remove those braces.

> > @@ -140,18 +173,14 @@ int git_gpg_config(const char *var, const
> > char *value, void *cb) }
> >  
> > if (!strcmp(var, "gpg.format")) {
> > -   if (strcasecmp(value, "openpgp"))
> > +   if (!get_format_data(value))
> > return error("malformed value for %s: %s",
> > var, value); return git_config_string(_format, var, value);
> > }  
> 
> Much nicer 

Re: [PATCH v2 5/9] gpg-interface: introduce an abstraction for multiple gpg formats

2018-07-13 Thread Henning Schild
Am Tue, 10 Jul 2018 10:16:39 -0700
schrieb Junio C Hamano :

> Henning Schild  writes:
> 
> > Create a struct that holds the format details for the supported
> > formats. At the moment that is still just "openpgp". This commit
> > prepares for the introduction of more formats, that might use other
> > programs and match other signatures.
> >
> > Signed-off-by: Henning Schild 
> > ---
> >  gpg-interface.c | 74
> > - 1 file
> > changed, 58 insertions(+), 16 deletions(-)
> >
> > diff --git a/gpg-interface.c b/gpg-interface.c
> > index ed0e55917..0a8d1bff3 100644
> > --- a/gpg-interface.c
> > +++ b/gpg-interface.c
> > @@ -7,12 +7,46 @@
> >  #include "tempfile.h"
> >  
> >  static char *configured_signing_key;
> > -static const char *gpg_format = "openpgp";
> > -static const char *gpg_program = "gpg";
> > +struct gpg_format_data {
> > +   const char *format;
> > +   const char *program;
> > +   const char *extra_args_verify[1];
> > +   const char *sigs[2];
> > +};  
> 
> Do you use identifier "gpg_format" elsewhere as a structure name?  A
> structure is there always to hold "data", so often "_data" suffix is
> meaningless in a structure type name like this.

i renamed the struct to "gpg_format" and what was "gpg_format" before,
a const char* holding the format we use for signing, is now a pointer
into gpg_formats and called "current_format"
 
> >  #define PGP_SIGNATURE "-BEGIN PGP SIGNATURE-"
> >  #define PGP_MESSAGE "-BEGIN PGP MESSAGE-"
> >  
> > +enum gpgformats { PGP_FMT };
> > +struct gpg_format_data gpg_formats[] = {
> > +   { .format = "openpgp", .program = "gpg",
> > + .extra_args_verify = { "--keyid-format=long" },
> > + .sigs = { PGP_SIGNATURE, PGP_MESSAGE }
> > +   },
> > +};
> > +static const char *gpg_format = "openpgp";
> > +
> > +static struct gpg_format_data *get_format_data(const char *str)  
> 
> get_format_data_by_name() or something like that, for consistency
> with the next function, perhaps?

I went for get_format_by_name and also renamed "format" inside the
struct to be "name"

> > +{
> > +   int i;
> > +   for (i = 0; i < ARRAY_SIZE(gpg_formats); i++)
> > +   if (!strcasecmp(gpg_formats[i].format, str))
> > +   return gpg_formats + i;
> > +   return NULL;
> > +}
> > +
> > +static struct gpg_format_data *get_format_data_by_sig(const char
> > *sig) +{
> > +   int i, j;
> > +   for (i = 0; i < ARRAY_SIZE(gpg_formats); i++)
> > +   for (j = 0; j < ARRAY_SIZE(gpg_formats[i].sigs);
> > j++)
> > +   if (gpg_formats[i].sigs[j] && 
> > +   !strncmp(gpg_formats[i].sigs[j], sig,
> > +
> > strlen(gpg_formats[i].sigs[j])))
> > +   return gpg_formats + i;
> > +   return NULL;
> > +}
> > +
> >  void signature_check_clear(struct signature_check *sigc)
> >  {
> > FREE_AND_NULL(sigc->payload);
> > @@ -104,8 +138,7 @@ void print_signature_buffer(const struct
> > signature_check *sigc, unsigned flags) 
> >  static int is_gpg_start(const char *line)
> >  {
> > -   return starts_with(line, PGP_SIGNATURE) ||
> > -   starts_with(line, PGP_MESSAGE);
> > +   return (get_format_data_by_sig(line) != NULL);  
> 
> If this is still called "is_gpg_start()", then shouldn't the
> implementation be more like this?
> 
>   struct gpg_format *found = get_format_data_by_sig(line);
> 
>   return found && found == _formats[PGP_FMT];
> 
> It probably does not make much difference in the end, as you won't
> have functions is_(gpg|gpgsm|somethingelse|yetanother)_start() but
> either have a single function "does some x509 looking block start
> here?" or "I am expecting gpg and nothing else, does a block for
> that start here?" and at that point this will no longer be called
> is_gpg_start(), I would expect.

I dropped the whole thing since it is now a one-liner and only called
on one place.

> >  size_t parse_signature(const char *buf, size_t size)
> > @@ -140,18 +173,14 @@ int git_gpg_config(const char *var, const
> > char *value, void *cb) }
> >  
> > if (!strcmp(var, "gpg.format")) {
> > -   if (strcasecmp(value, "openpgp"))
> > +   if (!get_format_data(value))
> > return error("malformed value for %s: %s",
> > var, value); return git_config_string(_format, var, value);  
> 
> This is a bug introduced in an earlier step in the series, but you
> will segfault when given a configuration that looks like this:
> 
>   [gpg]
>   format
> 
> Imitate the way how !value is diagnosed upfront for "gpg.program"
> below before this patch.

The next version will take care of this.

> > }
> >  
> > -   if (!strcmp(var, "gpg.program")) {
> > -   if (!value)
> > -   return config_error_nonbool(var);
> > -   gpg_program = xstrdup(value);
> > -   return 0;
> > -   }
> > -
> > +   if (!strcmp(var, "gpg.program"))
> > +   return
> > 

Re: [PATCH v2 5/9] gpg-interface: introduce an abstraction for multiple gpg formats

2018-07-10 Thread Junio C Hamano
Henning Schild  writes:

> Create a struct that holds the format details for the supported formats.
> At the moment that is still just "openpgp". This commit prepares for the
> introduction of more formats, that might use other programs and match
> other signatures.
>
> Signed-off-by: Henning Schild 
> ---
>  gpg-interface.c | 74 
> -
>  1 file changed, 58 insertions(+), 16 deletions(-)
>
> diff --git a/gpg-interface.c b/gpg-interface.c
> index ed0e55917..0a8d1bff3 100644
> --- a/gpg-interface.c
> +++ b/gpg-interface.c
> @@ -7,12 +7,46 @@
>  #include "tempfile.h"
>  
>  static char *configured_signing_key;
> -static const char *gpg_format = "openpgp";
> -static const char *gpg_program = "gpg";
> +struct gpg_format_data {
> + const char *format;
> + const char *program;
> + const char *extra_args_verify[1];
> + const char *sigs[2];
> +};

Do you use identifier "gpg_format" elsewhere as a structure name?  A
structure is there always to hold "data", so often "_data" suffix is
meaningless in a structure type name like this.

>  #define PGP_SIGNATURE "-BEGIN PGP SIGNATURE-"
>  #define PGP_MESSAGE "-BEGIN PGP MESSAGE-"
>  
> +enum gpgformats { PGP_FMT };
> +struct gpg_format_data gpg_formats[] = {
> + { .format = "openpgp", .program = "gpg",
> +   .extra_args_verify = { "--keyid-format=long" },
> +   .sigs = { PGP_SIGNATURE, PGP_MESSAGE }
> + },
> +};
> +static const char *gpg_format = "openpgp";
> +
> +static struct gpg_format_data *get_format_data(const char *str)

get_format_data_by_name() or something like that, for consistency
with the next function, perhaps?

> +{
> + int i;
> + for (i = 0; i < ARRAY_SIZE(gpg_formats); i++)
> + if (!strcasecmp(gpg_formats[i].format, str))
> + return gpg_formats + i;
> + return NULL;
> +}
> +
> +static struct gpg_format_data *get_format_data_by_sig(const char *sig)
> +{
> + int i, j;
> + for (i = 0; i < ARRAY_SIZE(gpg_formats); i++)
> + for (j = 0; j < ARRAY_SIZE(gpg_formats[i].sigs); j++)
> + if (gpg_formats[i].sigs[j] && 
> + !strncmp(gpg_formats[i].sigs[j], sig,
> +  strlen(gpg_formats[i].sigs[j])))
> + return gpg_formats + i;
> + return NULL;
> +}
> +
>  void signature_check_clear(struct signature_check *sigc)
>  {
>   FREE_AND_NULL(sigc->payload);
> @@ -104,8 +138,7 @@ void print_signature_buffer(const struct signature_check 
> *sigc, unsigned flags)
>  
>  static int is_gpg_start(const char *line)
>  {
> - return starts_with(line, PGP_SIGNATURE) ||
> - starts_with(line, PGP_MESSAGE);
> + return (get_format_data_by_sig(line) != NULL);

If this is still called "is_gpg_start()", then shouldn't the
implementation be more like this?

struct gpg_format *found = get_format_data_by_sig(line);

return found && found == _formats[PGP_FMT];

It probably does not make much difference in the end, as you won't
have functions is_(gpg|gpgsm|somethingelse|yetanother)_start() but
either have a single function "does some x509 looking block start
here?" or "I am expecting gpg and nothing else, does a block for
that start here?" and at that point this will no longer be called
is_gpg_start(), I would expect.

>  size_t parse_signature(const char *buf, size_t size)
> @@ -140,18 +173,14 @@ int git_gpg_config(const char *var, const char *value, 
> void *cb)
>   }
>  
>   if (!strcmp(var, "gpg.format")) {
> - if (strcasecmp(value, "openpgp"))
> + if (!get_format_data(value))
>   return error("malformed value for %s: %s", var, value);
>   return git_config_string(_format, var, value);

This is a bug introduced in an earlier step in the series, but you
will segfault when given a configuration that looks like this:

[gpg]
format

Imitate the way how !value is diagnosed upfront for "gpg.program"
below before this patch.

>   }
>  
> - if (!strcmp(var, "gpg.program")) {
> - if (!value)
> - return config_error_nonbool(var);
> - gpg_program = xstrdup(value);
> - return 0;
> - }
> -
> + if (!strcmp(var, "gpg.program"))
> + return git_config_string(_formats[PGP_FMT].program, var,
> +  value);

git_config_string() has its own "do not feed boolean 'true' to me"
check, so this change does not introduce a regression.

> @@ -165,12 +194,16 @@ const char *get_signing_key(void)
>  int sign_buffer(struct strbuf *buffer, struct strbuf *signature, const char 
> *signing_key)
>  {
>   struct child_process gpg = CHILD_PROCESS_INIT;
> + struct gpg_format_data *fmt;
>   int ret;
>   size_t i, j, bottom;
>   struct strbuf gpg_status = STRBUF_INIT;
>  
> + fmt = 

Re: [PATCH v2 5/9] gpg-interface: introduce an abstraction for multiple gpg formats

2018-07-10 Thread Jeff King
On Tue, Jul 10, 2018 at 10:52:27AM +0200, Henning Schild wrote:

> Create a struct that holds the format details for the supported formats.
> At the moment that is still just "openpgp". This commit prepares for the
> introduction of more formats, that might use other programs and match
> other signatures.

Great, this looks like a good incremental step.

>  static char *configured_signing_key;
> -static const char *gpg_format = "openpgp";
> -static const char *gpg_program = "gpg";
> +struct gpg_format_data {
> + const char *format;
> + const char *program;
> + const char *extra_args_verify[1];
> + const char *sigs[2];
> +};

These magic numbers are at a weird distance from where we fill them in:

> +struct gpg_format_data gpg_formats[] = {
> + { .format = "openpgp", .program = "gpg",
> +   .extra_args_verify = { "--keyid-format=long" },
> +   .sigs = { PGP_SIGNATURE, PGP_MESSAGE }
> + },
> +};

I'm not sure if we can easily do any better in C, though. Declaring the
struct with an open-ended "[]" would make the compiler unhappy. We could
do something like:

  struct gpg_format_data {
...
const char **extra_args_verify;
  };
  ...
  static const char *openpgp_verify_args[] = {
"--key-id-format=long"
  };
  ...
  static struct gpg_format_data gpg_formats[] = {
{ ...
  .extra_args_verify = openpgp_verify_args
}
  };

I'm not sure if that's more horrible or less. It's worse to write in the
first place, but it's slightly easier to maintain going forward. I
dunno.

> +enum gpgformats { PGP_FMT };

Looks like we use this only for indexing the gpg_formats array. I know
that C guarantees 0-indexing, but if we're depending on it, it might be
worth writing out "PGP_FMT = 0" explicitly. And probably adding a
comment that this needs to remain in sync with the array.

The other alternative is that we could simply use
get_format_data("openpgp"), though that does add a minor runtime cost.

> +struct gpg_format_data gpg_formats[] = {
> + { .format = "openpgp", .program = "gpg",
> +   .extra_args_verify = { "--keyid-format=long" },
> +   .sigs = { PGP_SIGNATURE, PGP_MESSAGE }
> + },
> +};

This array should be marked static, I think.

> +static struct gpg_format_data *get_format_data(const char *str)
> +{
> + int i;
> + for (i = 0; i < ARRAY_SIZE(gpg_formats); i++)
> + if (!strcasecmp(gpg_formats[i].format, str))
> + return gpg_formats + i;
> + return NULL;
> +}

This looks much nicer than the assert()-ing version from v1.

> +static struct gpg_format_data *get_format_data_by_sig(const char *sig)
> +{
> + int i, j;
> + for (i = 0; i < ARRAY_SIZE(gpg_formats); i++)
> + for (j = 0; j < ARRAY_SIZE(gpg_formats[i].sigs); j++)
> + if (gpg_formats[i].sigs[j] && 
> + !strncmp(gpg_formats[i].sigs[j], sig,
> +  strlen(gpg_formats[i].sigs[j])))
> + return gpg_formats + i;
> + return NULL;
> +}

This might be a little more readable with:

  starts_with(sig, gpg_formats[i].sigs[j])

instead of the strncmp. It may also be more efficient, as we don't have
to compute the strlen of the prefix for each non-matching line (the
compiler _might_ be smart enough to realize these are all string
literals, but it's pretty buried).

I also wondered if our prefix matching here is overly loose. We have to
do a prefix match, since "sig" isn't terminated at the line buffer. So I
think we'd match:

  --- BEGIN PGP MESSAGE --- AND SOME OTHER STUFF ---

on a line. But I think that's no different than the current code. If we
care, I guess we could look for '\n' or '\0' immediately after.

>  static int is_gpg_start(const char *line)
>  {
> - return starts_with(line, PGP_SIGNATURE) ||
> - starts_with(line, PGP_MESSAGE);
> + return (get_format_data_by_sig(line) != NULL);
>  }

I don't know if we've ever discussed this style explicitly, but we'd
usually omit the unnecessary parentheses for the return here.

> @@ -140,18 +173,14 @@ int git_gpg_config(const char *var, const char *value, 
> void *cb)
>   }
>  
>   if (!strcmp(var, "gpg.format")) {
> - if (strcasecmp(value, "openpgp"))
> + if (!get_format_data(value))
>   return error("malformed value for %s: %s", var, value);
>   return git_config_string(_format, var, value);
>   }

Much nicer than v1.

> @@ -165,12 +194,16 @@ const char *get_signing_key(void)
>  int sign_buffer(struct strbuf *buffer, struct strbuf *signature, const char 
> *signing_key)
>  {
>   struct child_process gpg = CHILD_PROCESS_INIT;
> + struct gpg_format_data *fmt;
>   int ret;
>   size_t i, j, bottom;
>   struct strbuf gpg_status = STRBUF_INIT;
>  
> + fmt = get_format_data(gpg_format);
> + if (!fmt)
> + BUG("bad gpg_format '%s'", gpg_format);