Re: [PATCH v2 12/16] pathspec: create parse_long_magic function

2016-12-11 Thread Junio C Hamano
Stefan Beller  writes:

>> I do not see how it would work without further splitting the
>> attr_stack.  I think you made it per check[], but you would further
>> split it per  before losing the mutex, no?
>
> Well I have not yet looked into it again, so my memories are
> rusty, but the  is read only, such that the answers only
> need to be per thread?

 is read-only, so as long as you populate the singleton
attr's beforehand, they can be shared across threads.  
of course needs an instance per thread, and that is why you have
them typically on stack.

The process of filling  by asking for a set of attrs in
 for one  goes roughly like:

 - make sure attr_stack is set up for , namely, the
   info/attributes and .gitattributes files for each leading
   directory are parsed.

 - go over the attr_stack entries and see what entries match ,
   and collect the result for  in 

Before d90675c151 ("attr: keep attr stack for each check",
2016-11-10), I had only one instance of an attr stack [*1*], but
with that commit, you made it per , which is a good move to
allow us to optimize by keeping only the attributes relevant to
 on the attr stack.

But it does not solve the threading issue.

If multiple threads are asking for the same set of attrs
(i.e. running the same codepath using the same ) but for
s in different parts of the working tree (e.g. "git checkout"
that is multi-threaded, each thread asking for eol related
attributes and checking out different subdirectories), you'd need
mutex for correct operation at least, but that won't perform well
because you'd end up thrashing the attr stack.  You'd need to split
attr stack further and make it per (, thread) tuple and you
no longer need mutex at that point, but not before that.


[footnote]

*1* This is because the attr subsystem originally wasn't designed to
be used from multiple threads at the same time hence it was
sufficient to have a single "currently interested are of the
directory hierarchy".


Re: [PATCH v2 12/16] pathspec: create parse_long_magic function

2016-12-11 Thread Stefan Beller
On Sat, Dec 10, 2016 at 10:18 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> Jonathan Nieder mentioned off list that he prefers to see that
>> series rerolled without mutexes if possible. That is possible by
>> creating the questions "struct attr_check" before preloading the
>> index and then using the read only questions in the threaded code,
>> to obtain answers fast; also no need for a mutex.
>
> I do not see how it would work without further splitting the
> attr_stack.  I think you made it per check[], but you would further
> split it per  before losing the mutex, no?

Well I have not yet looked into it again, so my memories are
rusty, but the  is read only, such that the answers only
need to be per thread?


Re: [PATCH v2 12/16] pathspec: create parse_long_magic function

2016-12-10 Thread Junio C Hamano
Stefan Beller  writes:

> Jonathan Nieder mentioned off list that he prefers to see that
> series rerolled without mutexes if possible. That is possible by
> creating the questions "struct attr_check" before preloading the
> index and then using the read only questions in the threaded code,
> to obtain answers fast; also no need for a mutex.

I do not see how it would work without further splitting the
attr_stack.  I think you made it per check[], but you would further
split it per  before losing the mutex, no?


Re: [PATCH v2 12/16] pathspec: create parse_long_magic function

2016-12-09 Thread Brandon Williams
On 12/09, Stefan Beller wrote:
> On Fri, Dec 9, 2016 at 3:44 PM, Junio C Hamano  wrote:
> > Brandon Williams  writes:
> >
> >> Factor out the logic responsible for parsing long magic into its own
> >> function.  As well as hoist the prefix check logic outside of the inner
> >> loop as there isn't anything that needs to be done after matching
> >> "prefix:".
> >>
> >> Signed-off-by: Brandon Williams 
> >
> > These refactoring changes look like they are all going in the good
> > direction.  Stefan's :(attr:)path" changes however
> > have severe conflicts (e.g. the topic already does something similar
> > to this step and calls the factored-out function eat_long_magic()).
> >
> > My gut feeling is that we probably should ask Stefan's series to be
> > rebased on top of this series that cleans up pathspec implementation,
> > once it stabilizes.
> 
> Very much so.
> 
> Jonathan Nieder mentioned off list that he prefers to see that
> series rerolled without mutexes if possible. That is possible by
> creating the questions "struct attr_check" before preloading the
> index and then using the read only questions in the threaded code,
> to obtain answers fast; also no need for a mutex.
> 
> I did not look into that yet, though. So I think you could discard that
> series (again) until I find time to either redo the series or
> resend it with a proper explanation on why the approach above
> is not feasible.
> 
> >  We could probably go the other way around, but
> > logically it makes more sense to build "pathspec can also match
> > using attributes information" on top of a refactored codebase.
> >
> > Thoughts?
> 
> Please let the refactoring in in favor of the attr series.

Sounds good.  I only looked at your series briefly, but I'm hoping that
these refactoring changes I'm proposing make it relatively easy for you
to build on top of in the future.

-- 
Brandon Williams


Re: [PATCH v2 12/16] pathspec: create parse_long_magic function

2016-12-09 Thread Stefan Beller
On Fri, Dec 9, 2016 at 3:44 PM, Junio C Hamano  wrote:
> Brandon Williams  writes:
>
>> Factor out the logic responsible for parsing long magic into its own
>> function.  As well as hoist the prefix check logic outside of the inner
>> loop as there isn't anything that needs to be done after matching
>> "prefix:".
>>
>> Signed-off-by: Brandon Williams 
>
> These refactoring changes look like they are all going in the good
> direction.  Stefan's :(attr:)path" changes however
> have severe conflicts (e.g. the topic already does something similar
> to this step and calls the factored-out function eat_long_magic()).
>
> My gut feeling is that we probably should ask Stefan's series to be
> rebased on top of this series that cleans up pathspec implementation,
> once it stabilizes.

Very much so.

Jonathan Nieder mentioned off list that he prefers to see that
series rerolled without mutexes if possible. That is possible by
creating the questions "struct attr_check" before preloading the
index and then using the read only questions in the threaded code,
to obtain answers fast; also no need for a mutex.

I did not look into that yet, though. So I think you could discard that
series (again) until I find time to either redo the series or
resend it with a proper explanation on why the approach above
is not feasible.

>  We could probably go the other way around, but
> logically it makes more sense to build "pathspec can also match
> using attributes information" on top of a refactored codebase.
>
> Thoughts?

Please let the refactoring in in favor of the attr series.


Re: [PATCH v2 12/16] pathspec: create parse_long_magic function

2016-12-09 Thread Junio C Hamano
Brandon Williams  writes:

> Factor out the logic responsible for parsing long magic into its own
> function.  As well as hoist the prefix check logic outside of the inner
> loop as there isn't anything that needs to be done after matching
> "prefix:".
>
> Signed-off-by: Brandon Williams 

These refactoring changes look like they are all going in the good
direction.  Stefan's :(attr:)path" changes however
have severe conflicts (e.g. the topic already does something similar
to this step and calls the factored-out function eat_long_magic()).

My gut feeling is that we probably should ask Stefan's series to be
rebased on top of this series that cleans up pathspec implementation,
once it stabilizes.  We could probably go the other way around, but
logically it makes more sense to build "pathspec can also match
using attributes information" on top of a refactored codebase.

Thoughts?


[PATCH v2 12/16] pathspec: create parse_long_magic function

2016-12-08 Thread Brandon Williams
Factor out the logic responsible for parsing long magic into its own
function.  As well as hoist the prefix check logic outside of the inner
loop as there isn't anything that needs to be done after matching
"prefix:".

Signed-off-by: Brandon Williams 
---
 pathspec.c | 92 ++
 1 file changed, 57 insertions(+), 35 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index 29054d2..6e9555e 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -157,6 +157,60 @@ static int get_global_magic(int element_magic)
 }
 
 /*
+ * Parse the pathspec element looking for long magic
+ *
+ * saves all magic in 'magic'
+ * if prefix magic is used, save the prefix length in 'prefix_len'
+ * returns the position in 'elem' after all magic has been parsed
+ */
+static const char *parse_long_magic(unsigned *magic, int *prefix_len,
+   const char *elem)
+{
+   const char *pos;
+   const char *nextat;
+
+   for (pos = elem + 2; *pos && *pos != ')'; pos = nextat) {
+   size_t len = strcspn(pos, ",)");
+   int i;
+
+   if (pos[len] == ',')
+   nextat = pos + len + 1; /* handle ',' */
+   else
+   nextat = pos + len; /* handle ')' and '\0' */
+
+   if (!len)
+   continue;
+
+   if (starts_with(pos, "prefix:")) {
+   char *endptr;
+   *prefix_len = strtol(pos + 7, , 10);
+   if (endptr - pos != len)
+   die(_("invalid parameter for pathspec magic 
'prefix'"));
+   continue;
+   }
+
+   for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
+   if (strlen(pathspec_magic[i].name) == len &&
+   !strncmp(pathspec_magic[i].name, pos, len)) {
+   *magic |= pathspec_magic[i].bit;
+   break;
+   }
+   }
+
+   if (ARRAY_SIZE(pathspec_magic) <= i)
+   die(_("Invalid pathspec magic '%.*s' in '%s'"),
+   (int) len, pos, elem);
+   }
+
+   if (*pos != ')')
+   die(_("Missing ')' at the end of pathspec magic in '%s'"),
+   elem);
+   pos++;
+
+   return pos;
+}
+
+/*
  * Parse the pathspec element looking for short magic
  *
  * saves all magic in 'magic'
@@ -218,41 +272,9 @@ static unsigned prefix_pathspec(struct pathspec_item 
*item, unsigned flags,
; /* nothing to do */
} else if (elt[1] == '(') {
/* longhand */
-   const char *nextat;
-   for (copyfrom = elt + 2;
-*copyfrom && *copyfrom != ')';
-copyfrom = nextat) {
-   size_t len = strcspn(copyfrom, ",)");
-   if (copyfrom[len] == ',')
-   nextat = copyfrom + len + 1;
-   else
-   /* handle ')' and '\0' */
-   nextat = copyfrom + len;
-   if (!len)
-   continue;
-   for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
-   if (strlen(pathspec_magic[i].name) == len &&
-   !strncmp(pathspec_magic[i].name, copyfrom, 
len)) {
-   element_magic |= pathspec_magic[i].bit;
-   break;
-   }
-   if (starts_with(copyfrom, "prefix:")) {
-   char *endptr;
-   pathspec_prefix = strtol(copyfrom + 7,
-, 10);
-   if (endptr - copyfrom != len)
-   die(_("invalid parameter for 
pathspec magic 'prefix'"));
-   /* "i" would be wrong, but it does not 
matter */
-   break;
-   }
-   }
-   if (ARRAY_SIZE(pathspec_magic) <= i)
-   die(_("Invalid pathspec magic '%.*s' in '%s'"),
-   (int) len, copyfrom, elt);
-   }
-   if (*copyfrom != ')')
-   die(_("Missing ')' at the end of pathspec magic in 
'%s'"), elt);
-   copyfrom++;
+   copyfrom = parse_long_magic(_magic,
+   _prefix,
+   elt);
} else {
/* shorthand */
copyfrom =