Re: [PATCH v8 1/2] add `config_set` API for caching config-like files

2014-07-15 Thread Tanay Abhra


On 7/11/2014 7:51 PM, Matthieu Moy wrote:
 +`int git_configset_add_file(struct config_set *cs, const char *filename)`::
 +
 +Parses the file and adds the variable-value pairs to the `config_set`,
 +dies if there is an error in parsing the file.
 
 The return value is undocumented.
 
 If I read correctly, the only codepath from this to a syntax error sets
 die_on_error, hence dies if there is an error in parsing the file is
 correct.
 
 Still, there are errors like unreadable file or no such file that do
 not die (nor emit any error message, which is not very good for the
 user), and lead to returning -1 here.
 
 I'm not sure this distinction is right (why die on syntax error and
 continue running on unreadable file?).
 

I checked the whole codebase and in all of the cases if they cannot read a file
they return -1 and continue running. So, I have updated the API to document the
return value.

I think if the file is unreadable. we must continue running as no harm has been
done yet, worse is parsing a file with wrong syntax which can cause reading
wrong config values. So the decision to die on syntax error sounds alright
to me.

Cheers,
Tanay Abhra.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 1/2] add `config_set` API for caching config-like files

2014-07-15 Thread Matthieu Moy
Tanay Abhra tanay...@gmail.com writes:

 I checked the whole codebase and in all of the cases if they cannot read a 
 file
 they return -1 and continue running.

More precisely: in git_config_from_file, any fopen failure results in
return -1. But in git_config_early (one caller of
git_config_from_file()), the file is checked before access, e.g.:

if (git_config_system()  !access_or_die(git_etc_gitconfig(), R_OK, 
0)) {
ret += git_config_from_file(fn, git_etc_gitconfig(),
data);
found += 1;
}

Essentially, if a config file does not exist, it's skipped (obviously
desirable), but if some really weird error occur (if err == ENOENT ||
err == ENOTDIR || ((flag  ACCESS_EACCES_OK)  err == EACCES is false,
from access_eacces_ok() in wrapper.c), then the process dies.

Permission denied errors are allowed for user-wide config, but not for
others. Read the log for commit 4698c8feb for more details.

Anyway, this is the part of the code you're not touching.

(I actually consider it as a bug that git config --file no-such-file
foo.bar and git config --file unreadable-file foo.bar fail without
error message, but your commit does not change this).

 I think if the file is unreadable. we must continue running as no harm has 
 been
 done yet, worse is parsing a file with wrong syntax which can cause reading
 wrong config values. So the decision to die on syntax error sounds alright
 to me.

In the case of git_config_check_init(), I think it makes sense to die,
because file accesses are protected with access_or_die(), so the return
value can be negative only if something really went wrong.

If you chose not to die, then you should check the return value in the
callers of git_config_check_init().

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 1/2] add `config_set` API for caching config-like files

2014-07-11 Thread Matthieu Moy
Hi,

I had a closer look at error management (once more, sorry: I should have
done this earlier...), and it seems to me that:

* Not all errors are managed properly

* Most error cases are untested

Among the cases I can think of:

* Syntax error when parsing the file

* Non-existant file

* Unreadable file (chmod -r)

Tanay Abhra tanay...@gmail.com writes:

 +`int git_configset_add_file(struct config_set *cs, const char *filename)`::
 +
 + Parses the file and adds the variable-value pairs to the `config_set`,
 + dies if there is an error in parsing the file.

The return value is undocumented.

If I read correctly, the only codepath from this to a syntax error sets
die_on_error, hence dies if there is an error in parsing the file is
correct.

Still, there are errors like unreadable file or no such file that do
not die (nor emit any error message, which is not very good for the
user), and lead to returning -1 here.

I'm not sure this distinction is right (why die on syntax error and
continue running on unreadable file?).

In any case, it should be documented and tested. I'll send a fixup patch
with a few more example tests (probably insufficient).

 +static int git_config_check_init(void)
 +{
 + int ret = 0;
 + if (the_config_set.hash_initialized)
 + return 0;
 + configset_init(the_config_set);
 + ret = git_config(config_hash_callback, the_config_set);
 + if (ret = 0)
 + return 0;
 + else {
 + hashmap_free(the_config_set.config_hash, 1);
 + the_config_set.hash_initialized = 0;
 + return -1;
 + }
 +}

We have the same convention for errors here. But a more serious issue is
that the return value of this function is ignored most of the time.

It seems git_config should never return a negative value, as it calls
git_config_with_options - git_config_early, which checks for file
existance and permission before calling git_config_from_file. Indeed,
Git's tests still pass after this:

--- a/config.c
+++ b/config.c
@@ -1225,7 +1225,10 @@ int git_config_with_options(config_fn_t fn, void *data,
 
 int git_config(config_fn_t fn, void *data)
 {
-   return git_config_with_options(fn, data, NULL, 1);
+   int ret = git_config_with_options(fn, data, NULL, 1);
+   if (ret  0)
+   die(Negative return value in git_config);
+   return ret;
 }

Still, we can imagine cases like race condition between access_or_die()
and git_config_from_file() in

if (git_config_system()  !access_or_die(git_etc_gitconfig(), R_OK, 
0)) {
ret += git_config_from_file(fn, git_etc_gitconfig(),
data);
found += 1;
}

where the function would indeed return -1. In any case, either we
consider that git_config should never return -1, and we should die in
this case, or we consider that it may happen, and that the else branch
of the if/else above is not dead code, and then we can't just ignore the
return value.

I think we should just do something like this:

diff --git a/config.c b/config.c
index 74adbbd..5c023e8 100644
--- a/config.c
+++ b/config.c
@@ -1428,7 +1428,7 @@ int git_configset_get_pathname(struct config_set *cs, 
const char *key, const cha
return 1;
 }
 
-static int git_config_check_init(void)
+static void git_config_check_init(void)
 {
int ret = 0;
if (the_config_set.hash_initialized)
@@ -1437,11 +1437,8 @@ static int git_config_check_init(void)
ret = git_config(config_hash_callback, the_config_set);
if (ret = 0)
return 0;
-   else {
-   hashmap_free(the_config_set.config_hash, 1);
-   the_config_set.hash_initialized = 0;
-   return -1;
-   }
+   else
+   die(Unknown error when parsing one of the configuration 
files.);
 }
 
If not, a comment should explain what the else branch corresponds to,
and why/when the return value can be safely ignored.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 1/2] add `config_set` API for caching config-like files

2014-07-11 Thread Junio C Hamano
Tanay Abhra tanay...@gmail.com writes:

 diff --git a/config.c b/config.c
 index ba882a1..aa58275 100644
 --- a/config.c
 +++ b/config.c
 @@ -9,6 +9,8 @@
  #include exec_cmd.h
  #include strbuf.h
  #include quote.h
 +#include hashmap.h
 +#include string-list.h
  
  struct config_source {
   struct config_source *prev;
 @@ -33,10 +35,23 @@ struct config_source {
   long (*do_ftell)(struct config_source *c);
  };
  
 +struct config_hash_entry {
 + struct hashmap_entry ent;
 + char *key;
 + struct string_list value_list;
 +};
 +
  static struct config_source *cf;
  
  static int zlib_compression_seen;
  
 +/*
 + * Default config_set that contains key-value pairs from the usual set of 
 config
 + * config files (i.e repo specific .git/config, user wide ~/.gitconfig, XDG
 + * config file and the global /etc/gitconfig)
 + */
 +static struct config_set the_config_set;
 +
  static int config_file_fgetc(struct config_source *conf)
  {
   return fgetc(conf-u.file);
 @@ -1212,6 +1227,284 @@ int git_config(config_fn_t fn, void *data)
   return git_config_with_options(fn, data, NULL, 1);
  }
  
 +static int config_hash_add_value(const char *, const char *, struct hashmap 
 *);

This is a funny forward declaration.  If you define add-file and
friends that modify the config-set _after_ you define find-entry and
friends that are used to look-up, would you still need to do a
forward declaration?

In any case, please give names to these first two parameters as what
they are for can be unclear; because they are of the same type,
there is one less clue than there usually are.

The function signature makes it sound as if this is not specific to
config; what takes a hashmap, a key and a value and is called add
is a function to add key, value pair to a hashmap.  Why doesn't it
take struct config_set?  In other words, I would have expected

static int config_set_add(struct config_set *, const char *key, const char 
*value)

instead.  Not a complaint, but is puzzled why you chose not to do
so.

I suspect almost everywhere in this patch, you would want to do
s/config_hash/config_set/.  s/config_hash_entry/config_set_element/
might be a good idea, too.  You have the concept of the config
set, and each element of that set is a config-set element, not a
config-hash entry, isn't it?

 +static int config_hash_entry_cmp(const struct config_hash_entry *e1,
 +  const struct config_hash_entry *e2, const void 
 *unused)
 +{
 + return strcmp(e1-key, e2-key);
 +}
 +
 +static void configset_init(struct config_set *cs)
 +{
 + if (!cs-hash_initialized) {
 + hashmap_init(cs-config_hash, 
 (hashmap_cmp_fn)config_hash_entry_cmp, 0);
 + cs-hash_initialized = 1;
 + }
 +}

Have uninitializer here, immediately after you defined the initializer.

 +static int config_hash_callback(const char *key, const char *value, void *cb)
 +{
 + struct config_set *cs = cb;
 + config_hash_add_value(key, value, cs-config_hash);
 + return 0;
 +}
 +
 +int git_configset_add_file(struct config_set *cs, const char *filename)
 +{
 + int ret = 0;
 + configset_init(cs);
 + ret = git_config_from_file(config_hash_callback, filename, cs);
 + if (!ret)
 + return 0;
 + else {
 + hashmap_free(cs-config_hash, 1);
 + cs-hash_initialized = 0;
 + return -1;

Does calling configset_clear() do too much for the purpose of this
error path?  In other words, is it deliberate that you do not touch
any string-list items you may have accumulated by calling the
callback?

 +static struct config_hash_entry *config_hash_find_entry(const char *key,
 + struct hashmap 
 *config_hash)
 +{
 + struct config_hash_entry k;
 + struct config_hash_entry *found_entry;
 + char *normalized_key;
 + int ret;
 + /*
 +  * `key` may come from the user, so normalize it before using it
 +  * for querying entries from the hashmap.
 +  */
 + ret = git_config_parse_key(key, normalized_key, NULL);
 +
 + if (ret)
 + return NULL;
 +
 + hashmap_entry_init(k, strhash(normalized_key));
 + k.key = normalized_key;
 + found_entry = hashmap_get(config_hash, k, NULL);
 + free(normalized_key);
 + return found_entry;
 +}
 +
 +static struct string_list *configset_get_list(const char *key, struct 
 config_set *cs)
 +{
 + struct config_hash_entry *e = config_hash_find_entry(key, 
 cs-config_hash);
 + return e ? e-value_list : NULL;
 +}
 +
 +static int config_hash_add_value(const char *key, const char *value, struct 
 hashmap *config_hash)
 +{
 + struct config_hash_entry *e;
 + e = config_hash_find_entry(key, config_hash);
 + /*
 +  * Since the keys are being fed by git_config*() callback mechanism, 
 they
 +  * are already normalized. So simply add them without any further 
 munging.
 +  */
 + if (!e) {
 +