Re: [PATCH v8 1/2] add `config_set` API for caching config-like files
On 7/11/2014 7:51 PM, Matthieu Moy wrote: +`int git_configset_add_file(struct config_set *cs, const char *filename)`:: + +Parses the file and adds the variable-value pairs to the `config_set`, +dies if there is an error in parsing the file. The return value is undocumented. If I read correctly, the only codepath from this to a syntax error sets die_on_error, hence dies if there is an error in parsing the file is correct. Still, there are errors like unreadable file or no such file that do not die (nor emit any error message, which is not very good for the user), and lead to returning -1 here. I'm not sure this distinction is right (why die on syntax error and continue running on unreadable file?). I checked the whole codebase and in all of the cases if they cannot read a file they return -1 and continue running. So, I have updated the API to document the return value. I think if the file is unreadable. we must continue running as no harm has been done yet, worse is parsing a file with wrong syntax which can cause reading wrong config values. So the decision to die on syntax error sounds alright to me. Cheers, Tanay Abhra. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 1/2] add `config_set` API for caching config-like files
Tanay Abhra tanay...@gmail.com writes: I checked the whole codebase and in all of the cases if they cannot read a file they return -1 and continue running. More precisely: in git_config_from_file, any fopen failure results in return -1. But in git_config_early (one caller of git_config_from_file()), the file is checked before access, e.g.: if (git_config_system() !access_or_die(git_etc_gitconfig(), R_OK, 0)) { ret += git_config_from_file(fn, git_etc_gitconfig(), data); found += 1; } Essentially, if a config file does not exist, it's skipped (obviously desirable), but if some really weird error occur (if err == ENOENT || err == ENOTDIR || ((flag ACCESS_EACCES_OK) err == EACCES is false, from access_eacces_ok() in wrapper.c), then the process dies. Permission denied errors are allowed for user-wide config, but not for others. Read the log for commit 4698c8feb for more details. Anyway, this is the part of the code you're not touching. (I actually consider it as a bug that git config --file no-such-file foo.bar and git config --file unreadable-file foo.bar fail without error message, but your commit does not change this). I think if the file is unreadable. we must continue running as no harm has been done yet, worse is parsing a file with wrong syntax which can cause reading wrong config values. So the decision to die on syntax error sounds alright to me. In the case of git_config_check_init(), I think it makes sense to die, because file accesses are protected with access_or_die(), so the return value can be negative only if something really went wrong. If you chose not to die, then you should check the return value in the callers of git_config_check_init(). -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 1/2] add `config_set` API for caching config-like files
Hi, I had a closer look at error management (once more, sorry: I should have done this earlier...), and it seems to me that: * Not all errors are managed properly * Most error cases are untested Among the cases I can think of: * Syntax error when parsing the file * Non-existant file * Unreadable file (chmod -r) Tanay Abhra tanay...@gmail.com writes: +`int git_configset_add_file(struct config_set *cs, const char *filename)`:: + + Parses the file and adds the variable-value pairs to the `config_set`, + dies if there is an error in parsing the file. The return value is undocumented. If I read correctly, the only codepath from this to a syntax error sets die_on_error, hence dies if there is an error in parsing the file is correct. Still, there are errors like unreadable file or no such file that do not die (nor emit any error message, which is not very good for the user), and lead to returning -1 here. I'm not sure this distinction is right (why die on syntax error and continue running on unreadable file?). In any case, it should be documented and tested. I'll send a fixup patch with a few more example tests (probably insufficient). +static int git_config_check_init(void) +{ + int ret = 0; + if (the_config_set.hash_initialized) + return 0; + configset_init(the_config_set); + ret = git_config(config_hash_callback, the_config_set); + if (ret = 0) + return 0; + else { + hashmap_free(the_config_set.config_hash, 1); + the_config_set.hash_initialized = 0; + return -1; + } +} We have the same convention for errors here. But a more serious issue is that the return value of this function is ignored most of the time. It seems git_config should never return a negative value, as it calls git_config_with_options - git_config_early, which checks for file existance and permission before calling git_config_from_file. Indeed, Git's tests still pass after this: --- a/config.c +++ b/config.c @@ -1225,7 +1225,10 @@ int git_config_with_options(config_fn_t fn, void *data, int git_config(config_fn_t fn, void *data) { - return git_config_with_options(fn, data, NULL, 1); + int ret = git_config_with_options(fn, data, NULL, 1); + if (ret 0) + die(Negative return value in git_config); + return ret; } Still, we can imagine cases like race condition between access_or_die() and git_config_from_file() in if (git_config_system() !access_or_die(git_etc_gitconfig(), R_OK, 0)) { ret += git_config_from_file(fn, git_etc_gitconfig(), data); found += 1; } where the function would indeed return -1. In any case, either we consider that git_config should never return -1, and we should die in this case, or we consider that it may happen, and that the else branch of the if/else above is not dead code, and then we can't just ignore the return value. I think we should just do something like this: diff --git a/config.c b/config.c index 74adbbd..5c023e8 100644 --- a/config.c +++ b/config.c @@ -1428,7 +1428,7 @@ int git_configset_get_pathname(struct config_set *cs, const char *key, const cha return 1; } -static int git_config_check_init(void) +static void git_config_check_init(void) { int ret = 0; if (the_config_set.hash_initialized) @@ -1437,11 +1437,8 @@ static int git_config_check_init(void) ret = git_config(config_hash_callback, the_config_set); if (ret = 0) return 0; - else { - hashmap_free(the_config_set.config_hash, 1); - the_config_set.hash_initialized = 0; - return -1; - } + else + die(Unknown error when parsing one of the configuration files.); } If not, a comment should explain what the else branch corresponds to, and why/when the return value can be safely ignored. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 1/2] add `config_set` API for caching config-like files
Tanay Abhra tanay...@gmail.com writes: diff --git a/config.c b/config.c index ba882a1..aa58275 100644 --- a/config.c +++ b/config.c @@ -9,6 +9,8 @@ #include exec_cmd.h #include strbuf.h #include quote.h +#include hashmap.h +#include string-list.h struct config_source { struct config_source *prev; @@ -33,10 +35,23 @@ struct config_source { long (*do_ftell)(struct config_source *c); }; +struct config_hash_entry { + struct hashmap_entry ent; + char *key; + struct string_list value_list; +}; + static struct config_source *cf; static int zlib_compression_seen; +/* + * Default config_set that contains key-value pairs from the usual set of config + * config files (i.e repo specific .git/config, user wide ~/.gitconfig, XDG + * config file and the global /etc/gitconfig) + */ +static struct config_set the_config_set; + static int config_file_fgetc(struct config_source *conf) { return fgetc(conf-u.file); @@ -1212,6 +1227,284 @@ int git_config(config_fn_t fn, void *data) return git_config_with_options(fn, data, NULL, 1); } +static int config_hash_add_value(const char *, const char *, struct hashmap *); This is a funny forward declaration. If you define add-file and friends that modify the config-set _after_ you define find-entry and friends that are used to look-up, would you still need to do a forward declaration? In any case, please give names to these first two parameters as what they are for can be unclear; because they are of the same type, there is one less clue than there usually are. The function signature makes it sound as if this is not specific to config; what takes a hashmap, a key and a value and is called add is a function to add key, value pair to a hashmap. Why doesn't it take struct config_set? In other words, I would have expected static int config_set_add(struct config_set *, const char *key, const char *value) instead. Not a complaint, but is puzzled why you chose not to do so. I suspect almost everywhere in this patch, you would want to do s/config_hash/config_set/. s/config_hash_entry/config_set_element/ might be a good idea, too. You have the concept of the config set, and each element of that set is a config-set element, not a config-hash entry, isn't it? +static int config_hash_entry_cmp(const struct config_hash_entry *e1, + const struct config_hash_entry *e2, const void *unused) +{ + return strcmp(e1-key, e2-key); +} + +static void configset_init(struct config_set *cs) +{ + if (!cs-hash_initialized) { + hashmap_init(cs-config_hash, (hashmap_cmp_fn)config_hash_entry_cmp, 0); + cs-hash_initialized = 1; + } +} Have uninitializer here, immediately after you defined the initializer. +static int config_hash_callback(const char *key, const char *value, void *cb) +{ + struct config_set *cs = cb; + config_hash_add_value(key, value, cs-config_hash); + return 0; +} + +int git_configset_add_file(struct config_set *cs, const char *filename) +{ + int ret = 0; + configset_init(cs); + ret = git_config_from_file(config_hash_callback, filename, cs); + if (!ret) + return 0; + else { + hashmap_free(cs-config_hash, 1); + cs-hash_initialized = 0; + return -1; Does calling configset_clear() do too much for the purpose of this error path? In other words, is it deliberate that you do not touch any string-list items you may have accumulated by calling the callback? +static struct config_hash_entry *config_hash_find_entry(const char *key, + struct hashmap *config_hash) +{ + struct config_hash_entry k; + struct config_hash_entry *found_entry; + char *normalized_key; + int ret; + /* + * `key` may come from the user, so normalize it before using it + * for querying entries from the hashmap. + */ + ret = git_config_parse_key(key, normalized_key, NULL); + + if (ret) + return NULL; + + hashmap_entry_init(k, strhash(normalized_key)); + k.key = normalized_key; + found_entry = hashmap_get(config_hash, k, NULL); + free(normalized_key); + return found_entry; +} + +static struct string_list *configset_get_list(const char *key, struct config_set *cs) +{ + struct config_hash_entry *e = config_hash_find_entry(key, cs-config_hash); + return e ? e-value_list : NULL; +} + +static int config_hash_add_value(const char *key, const char *value, struct hashmap *config_hash) +{ + struct config_hash_entry *e; + e = config_hash_find_entry(key, config_hash); + /* + * Since the keys are being fed by git_config*() callback mechanism, they + * are already normalized. So simply add them without any further munging. + */ + if (!e) { +