Re: [PATCH 28/36] attr: keep attr stack for each check

2016-10-26 Thread Stefan Beller
On Sun, Oct 23, 2016 at 8:10 AM, Ramsay Jones
 wrote:
>> +
>> +struct hashmap all_attr_stacks;
>> +int all_attr_stacks_init;
>
> Mark symbols 'all_attr_stacks' and 'all_attr_stacks_init' with
> the static keyword. (ie. these are file-local symbols).
>
> ATB,
> Ramsay Jones
>

This fix will appear in a resend of the series.

Thanks,
Stefan


Re: [PATCH 28/36] attr: keep attr stack for each check

2016-10-24 Thread Junio C Hamano
Stefan Beller  writes:

> I looked for a platform independent way to get a thread id as a natural
> number, i.e. I want to get 1,2,3,... such that I could have just added
> list/array of attr stacks to each check, which would be the
>  tuple you envision.
>
> However I think we do not really need it to be per check.  If we had
> an easy portable way of getting such a thread id, I would have implemented
> a list of stacks per thread first. (Because each thread only looks at one
> check at a time.)

It seems that by "list of stacks per thread", you mean "there is a
list of stacks, each thread uses one and only element of that list",
but I do not think it would be desirable.

"Each thread only looks at one check at a time" is false.  For
example, "write_archive_entry()" would use one check that is
specific to "git archive" to learn about "export-ignore" and
"export-subst" attributes, while letting convert_to_write_tree()
called via sha1_file_to_archive() called via write_entry() method
(i.e. write_tar_entry() or write_zip_entry()) to use a separate
check that is specific to the convert.c API.

>> With manipulation of attr stack protected with a single Big
>> Attributes Lock, this should be safe.  It may not perform very well
>> when used by multiple threads, though ;-)
>
> I agree. So maybe it is not really a good fit for general consumption yet.

I would still think it is a good first step.  It may already be
thread-safe, but may not be thread-ready from performance point of
view.  IOW, this would not yet help an attempt to make the callers
faster by making them multi-threaded.



Re: [PATCH 28/36] attr: keep attr stack for each check

2016-10-24 Thread Stefan Beller
On Mon, Oct 24, 2016 at 12:07 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> Instead of having a global attr stack, attach the stack to each check.
>
> Two threads may be working on "git checkout", one "git_attr_check"
> in convert.c may be used by them to learn the EOL conversion for
> each path, and these threads are working in different parts of
> worktree in parallel.  The set of .gitattributes files each of these
> threads wants to be cached at one time is tied to where in the
> directory hierarchy the thread is working in.
>
> The view by API users would not have to change from the point on
> since 27/36 or so, but I think attr_stack needs to become per
>  tuple when we are fully thread-ready for the above
> reason.

I looked for a platform independent way to get a thread id as a natural
number, i.e. I want to get 1,2,3,... such that I could have just added
list/array of attr stacks to each check, which would be the
 tuple you envision.

However I think we do not really need it to be per check.  If we had
an easy portable way of getting such a thread id, I would have implemented
a list of stacks per thread first. (Because each thread only looks at one
check at a time.)

So this is not a baby step because I did not want to do it all at once, but
because I did not find a suitable API to use.

>
> But we need to start somewhere to move away from the current "one
> single attr stack" to "there are multiple attr stacks", and this
> "two checks may and do use different attr stacks" is probably a
> reasonable first step.  It may give a single-threaded API users
> immediate benefit if the "read and keep only the entries relevant
> to the query" optimization is done with this step alone, without
> making the cache per  pair.
>
>> This allows to use the attr in a multithreaded way.
>
> With manipulation of attr stack protected with a single Big
> Attributes Lock, this should be safe.  It may not perform very well
> when used by multiple threads, though ;-)

I agree. So maybe it is not really a good fit for general consumption yet.


Re: [PATCH 28/36] attr: keep attr stack for each check

2016-10-24 Thread Junio C Hamano
Stefan Beller  writes:

> Instead of having a global attr stack, attach the stack to each check.

Two threads may be working on "git checkout", one "git_attr_check"
in convert.c may be used by them to learn the EOL conversion for
each path, and these threads are working in different parts of
worktree in parallel.  The set of .gitattributes files each of these
threads wants to be cached at one time is tied to where in the
directory hierarchy the thread is working in.

The view by API users would not have to change from the point on
since 27/36 or so, but I think attr_stack needs to become per
 tuple when we are fully thread-ready for the above
reason.

But we need to start somewhere to move away from the current "one
single attr stack" to "there are multiple attr stacks", and this
"two checks may and do use different attr stacks" is probably a
reasonable first step.  It may give a single-threaded API users
immediate benefit if the "read and keep only the entries relevant
to the query" optimization is done with this step alone, without
making the cache per  pair.

> This allows to use the attr in a multithreaded way.

With manipulation of attr stack protected with a single Big
Attributes Lock, this should be safe.  It may not perform very well
when used by multiple threads, though ;-)



Re: [PATCH 28/36] attr: keep attr stack for each check

2016-10-23 Thread Ramsay Jones


On 23/10/16 00:32, Stefan Beller wrote:
> Instead of having a global attr stack, attach the stack to each check.
> This allows to use the attr in a multithreaded way.
> 
> 
> 
> Signed-off-by: Stefan Beller 
> ---
>  attr.c| 101 
> +++---
>  attr.h|   4 ++-
>  hashmap.h |   2 ++
>  3 files changed, 69 insertions(+), 38 deletions(-)
> 
> diff --git a/attr.c b/attr.c
> index 89ae155..b65437d 100644
> --- a/attr.c
> +++ b/attr.c
> @@ -372,15 +372,17 @@ static struct match_attr *parse_attr_line(const char 
> *line, const char *src,
>   * .gitignore file and info/excludes file as a fallback.
>   */
>  
> -/* NEEDSWORK: This will become per git_attr_check */
> -static struct attr_stack {
> +struct attr_stack {
>   struct attr_stack *prev;
>   char *origin;
>   size_t originlen;
>   unsigned num_matches;
>   unsigned alloc;
>   struct match_attr **attrs;
> -} *attr_stack;
> +};
> +
> +struct hashmap all_attr_stacks;
> +int all_attr_stacks_init;

Mark symbols 'all_attr_stacks' and 'all_attr_stacks_init' with
the static keyword. (ie. these are file-local symbols).

ATB,
Ramsay Jones



[PATCH 28/36] attr: keep attr stack for each check

2016-10-22 Thread Stefan Beller
Instead of having a global attr stack, attach the stack to each check.
This allows to use the attr in a multithreaded way.



Signed-off-by: Stefan Beller 
---
 attr.c| 101 +++---
 attr.h|   4 ++-
 hashmap.h |   2 ++
 3 files changed, 69 insertions(+), 38 deletions(-)

diff --git a/attr.c b/attr.c
index 89ae155..b65437d 100644
--- a/attr.c
+++ b/attr.c
@@ -372,15 +372,17 @@ static struct match_attr *parse_attr_line(const char 
*line, const char *src,
  * .gitignore file and info/excludes file as a fallback.
  */
 
-/* NEEDSWORK: This will become per git_attr_check */
-static struct attr_stack {
+struct attr_stack {
struct attr_stack *prev;
char *origin;
size_t originlen;
unsigned num_matches;
unsigned alloc;
struct match_attr **attrs;
-} *attr_stack;
+};
+
+struct hashmap all_attr_stacks;
+int all_attr_stacks_init;
 
 static void free_attr_elem(struct attr_stack *e)
 {
@@ -561,11 +563,23 @@ static void debug_set(const char *what, const char 
*match, struct git_attr *attr
 
 static void drop_attr_stack(void)
 {
-   while (attr_stack) {
-   struct attr_stack *elem = attr_stack;
-   attr_stack = elem->prev;
-   free_attr_elem(elem);
+   struct hashmap_iter iter;
+   struct git_attr_check *check;
+
+   attr_lock();
+   if (!all_attr_stacks_init) {
+   attr_unlock();
+   return;
}
+   hashmap_iter_init(_attr_stacks, );
+   while ((check = hashmap_iter_next())) {
+   while (check->attr_stack) {
+   struct attr_stack *elem = check->attr_stack;
+   check->attr_stack = elem->prev;
+   free_attr_elem(elem);
+   }
+   }
+   attr_unlock();
 }
 
 static const char *git_etc_gitattributes(void)
@@ -595,40 +609,42 @@ static void push_stack(struct attr_stack **attr_stack_p,
}
 }
 
-static void bootstrap_attr_stack(void)
+static void bootstrap_attr_stack(struct git_attr_check *check)
 {
struct attr_stack *elem;
 
-   if (attr_stack)
+   if (check->attr_stack)
return;
 
-   push_stack(_stack, read_attr_from_array(builtin_attr), NULL, 0);
+   push_stack(>attr_stack,
+  read_attr_from_array(builtin_attr), NULL, 0);
 
if (git_attr_system())
-   push_stack(_stack,
+   push_stack(>attr_stack,
   read_attr_from_file(git_etc_gitattributes(), 1),
   NULL, 0);
 
if (!git_attributes_file)
git_attributes_file = xdg_config_home("attributes");
if (git_attributes_file)
-   push_stack(_stack,
+   push_stack(>attr_stack,
   read_attr_from_file(git_attributes_file, 1),
   NULL, 0);
 
if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
elem = read_attr(GITATTRIBUTES_FILE, 1);
-   push_stack(_stack, elem, xstrdup(""), 0);
+   push_stack(>attr_stack, elem, xstrdup(""), 0);
debug_push(elem);
}
 
elem = read_attr_from_file(git_path_info_attributes(), 1);
if (!elem)
elem = xcalloc(1, sizeof(*elem));
-   push_stack(_stack, elem, NULL, 0);
+   push_stack(>attr_stack, elem, NULL, 0);
 }
 
-static void prepare_attr_stack(const char *path, int dirlen)
+static void prepare_attr_stack(const char *path, int dirlen,
+  struct git_attr_check *check)
 {
struct attr_stack *elem, *info;
const char *cp;
@@ -648,13 +664,13 @@ static void prepare_attr_stack(const char *path, int 
dirlen)
 * .gitattributes in deeper directories to shallower ones,
 * and finally use the built-in set as the default.
 */
-   bootstrap_attr_stack();
+   bootstrap_attr_stack(check);
 
/*
 * Pop the "info" one that is always at the top of the stack.
 */
-   info = attr_stack;
-   attr_stack = info->prev;
+   info = check->attr_stack;
+   check->attr_stack = info->prev;
 
/*
 * Pop the ones from directories that are not the prefix of
@@ -662,17 +678,17 @@ static void prepare_attr_stack(const char *path, int 
dirlen)
 * the root one (whose origin is an empty string "") or the builtin
 * one (whose origin is NULL) without popping it.
 */
-   while (attr_stack->origin) {
-   int namelen = strlen(attr_stack->origin);
+   while (check->attr_stack->origin) {
+   int namelen = strlen(check->attr_stack->origin);
 
-   elem = attr_stack;
+   elem = check->attr_stack;
if (namelen <= dirlen &&
!strncmp(elem->origin, path, namelen) &&
(!namelen || path[namelen]