[PATCH v9 1/2] add `config_set` API for caching config-like files

2014-07-15 Thread Tanay Abhra
Currently `git_config()` uses a callback mechanism and file rereads for
config values. Due to this approach, it is not uncommon for the config
files to be parsed several times during the run of a git program, with
different callbacks picking out different variables useful to themselves.

Add a `config_set`, that can be used to construct an in-memory cache for
config-like files that the caller specifies (i.e., files like `.gitmodules`,
`~/.gitconfig` etc.). Add two external functions `git_configset_get_value`
and `git_configset_get_value_multi` for querying from the config sets.
`git_configset_get_value` follows `last one wins` semantic (i.e. if there
are multiple matches for the queried key in the files of the configset the
value returned will be the last entry in `value_list`).
`git_configset_get_value_multi` returns a list of values sorted in order of
increasing priority (i.e. last match will be at the end of the list). Add
type specific query functions like `git_configset_get_bool` and similar.

Add a default `config_set`, `the_config_set` to cache all key-value pairs
read from usual config files (repo specific .git/config, user wide
~/.gitconfig, XDG config and the global /etc/gitconfig). `the_config_set`
is populated using `git_config()`.

Add two external functions `git_config_get_value` and 
`git_config_get_value_multi` for querying in a non-callback manner from
`the_config_set`. Also, add type specific query functions that are
implemented as a thin wrapper around the `config_set` API.

Signed-off-by: Matthieu Moy matthieu@imag.fr
Signed-off-by: Tanay Abhra tanay...@gmail.com
---
 Documentation/technical/api-config.txt | 135 
 cache.h|  31 
 config.c   | 276 +
 3 files changed, 442 insertions(+)

diff --git a/Documentation/technical/api-config.txt 
b/Documentation/technical/api-config.txt
index 230b3a0..48255a2 100644
--- a/Documentation/technical/api-config.txt
+++ b/Documentation/technical/api-config.txt
@@ -77,6 +77,81 @@ To read a specific file in git-config format, use
 `git_config_from_file`. This takes the same callback and data parameters
 as `git_config`.
 
+Querying For Specific Variables
+---
+
+For programs wanting to query for specific variables in a non-callback
+manner, the config API provides two functions `git_config_get_value`
+and `git_config_get_value_multi`. They both read values from an internal
+cache generated previously from reading the config files.
+
+`int git_config_get_value(const char *key, const char **value)`::
+
+   Finds the highest-priority value for the configuration variable `key`,
+   stores the pointer to it in `value` and returns 0. When the
+   configuration variable `key` is not found, returns 1 without touching
+   `value`. The caller should not free or modify `value`, as it is owned
+   by the cache.
+
+`const struct string_list *git_config_get_value_multi(const char *key)`::
+
+   Finds and returns the value list, sorted in order of increasing priority
+   for the configuration variable `key`. When the configuration variable
+   `key` is not found, returns NULL. The caller should not free or modify
+   the returned pointer, as it is owned by the cache.
+
+`void git_config_clear(void)`::
+
+   Resets and invalidates the config cache.
+
+The config API also provides type specific API functions which do conversion
+as well as retrieval for the queried variable, including:
+
+`int git_config_get_int(const char *key, int *dest)`::
+
+   Finds and parses the value to an integer for the configuration variable
+   `key`. Dies on error; otherwise, stores the value of the parsed integer 
in
+   `dest` and returns 0. When the configuration variable `key` is not 
found,
+   returns 1 without touching `dest`.
+
+`int git_config_get_ulong(const char *key, unsigned long *dest)`::
+
+   Similar to `git_config_get_int` but for unsigned longs.
+
+`int git_config_get_bool(const char *key, int *dest)`::
+
+   Finds and parses the value into a boolean value, for the configuration
+   variable `key` respecting keywords like true and false. Integer
+   values are converted into true/false values (when they are non-zero or
+   zero, respectively). Other values cause a die(). If parsing is 
successful,
+   stores the value of the parsed result in `dest` and returns 0. When the
+   configuration variable `key` is not found, returns 1 without touching
+   `dest`.
+
+`int git_config_get_bool_or_int(const char *key, int *is_bool, int *dest)`::
+
+   Similar to `git_config_get_bool`, except that integers are copied as-is,
+   and `is_bool` flag is unset.
+
+`int git_config_get_maybe_bool(const char *key, int *dest)`::
+
+   Similar to `git_config_get_bool`, except that it returns -1 on error
+   rather than dying.
+
+`int 

Re: [PATCH v9 1/2] add `config_set` API for caching config-like files

2014-07-15 Thread Junio C Hamano
Tanay Abhra tanay...@gmail.com writes:

 diff --git a/config.c b/config.c
 index ba882a1..89e2d67 100644
 --- a/config.c
 +++ b/config.c
 ...
 +const struct string_list *git_configset_get_value_multi(struct config_set 
 *cs, const char *key)
 +{
 + struct config_set_element *e = configset_find_element(cs, key);
 + return e ? e-value_list : NULL;
 +}

Somehow I find the placement of this function out of place.  Didn't
you get the same impression while proofreading the entire file after
you are done writing this patch?

The right place for it would probably be immediately before
git_configset_get_value(), I would think.

 +int git_configset_add_file(struct config_set *cs, const char *filename)
 +{
 + int ret = 0;
 + ret = git_config_from_file(config_hash_callback, filename, cs);
 + if (!ret)
 + return 0;
 + else {
 + git_configset_clear(cs);
 + return -1;
 + }
 +}

If I add contents from file A successfully and then attempt to
further add contents from file B which fails, I would lose
contents I read from A?

It would not be worth trying to make it transactional (i.e. making
sure cs contains what was there before the config-from-file was
called if it failed), so in such a case we may end up seeing a
mixture of full contents from A and partial contents from B, which
is just as useless as a cleared configset, so it is not wrong to
clear it per-se.  An alternative might be to return without
clearing, as we are leaving the configset in a useless state either
way and give the caller a choice between clearing it and continue,
and dying without even spending unnecessary cycles to clear.  That
might be more consistent with what git_config_check_init() does,
where you die without bothering to clear what was half-read into the
configset.

Whatever behaviour is chosen in the error codepath, it needs to be
documented.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v9 1/2] add `config_set` API for caching config-like files

2014-07-15 Thread Tanay Abhra


On 7/15/2014 9:16 PM, Junio C Hamano wrote:
 Tanay Abhra tanay...@gmail.com writes:
 
 diff --git a/config.c b/config.c
 index ba882a1..89e2d67 100644
 --- a/config.c
 +++ b/config.c
 ...
 +const struct string_list *git_configset_get_value_multi(struct config_set 
 *cs, const char *key)
 +{
 +struct config_set_element *e = configset_find_element(cs, key);
 +return e ? e-value_list : NULL;
 +}
 
 Somehow I find the placement of this function out of place.  Didn't
 you get the same impression while proofreading the entire file after
 you are done writing this patch?
 
 The right place for it would probably be immediately before
 git_configset_get_value(), I would think.


Noted. My fault, will correct in the reroll.

 +int git_configset_add_file(struct config_set *cs, const char *filename)
 +{
 +int ret = 0;
 +ret = git_config_from_file(config_hash_callback, filename, cs);
 +if (!ret)
 +return 0;
 +else {
 +git_configset_clear(cs);
 +return -1;
 +}
 +}
 
 If I add contents from file A successfully and then attempt to
 further add contents from file B which fails, I would lose
 contents I read from A?
 
 It would not be worth trying to make it transactional (i.e. making
 sure cs contains what was there before the config-from-file was
 called if it failed), so in such a case we may end up seeing a
 mixture of full contents from A and partial contents from B, which
 is just as useless as a cleared configset, so it is not wrong to
 clear it per-se.  An alternative might be to return without
 clearing, as we are leaving the configset in a useless state either
 way and give the caller a choice between clearing it and continue,
 and dying without even spending unnecessary cycles to clear.  That
 might be more consistent with what git_config_check_init() does,
 where you die without bothering to clear what was half-read into the
 configset.
 
 Whatever behaviour is chosen in the error codepath, it needs to be
 documented.


Since syntactical errors in reading the file will cause a die when we
use either `git_config_from_file` or `git_config`, I don't think
incomplete parsing would trigger the error path, as it will die before
reaching there.

So, the best way would be just to return without clearing and let the
user take the decision if he wants to go on with the incomplete
config_set or clear it when he sees that configset_add_file() failed.

Die in git_config_check_init() is never triggered expect in rare race
conditions like between access_or_die() and git_config_from_file() in

if (git_config_system()  !access_or_die(git_etc_gitconfig(), R_OK, 
0)) {
ret += git_config_from_file(fn, git_etc_gitconfig(),
data);
found += 1;
}
which I think would never happen in reality, dunno. I think I will take
out the die() in git_config_check_init(). Thus, the behavior of both
functions git_config_check_init()  git_configset_add_file will be consistent.

Thanks.


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html