Re: [PATCH 8/8] pathspec: convert parse_pathspec to take an index

2017-05-11 Thread Junio C Hamano
Brandon Williams  writes:

> ...  Note that if we go with the route to not pass
> in an index now, it doesn't necessarily mean that down the line we won't
> have to pass a 'repository' instance into parse_pathspec().

Correct.  

An attribute annotated pathspec may want to check if the attributes
used in it are used in the repository at all for validation or
optimization purposes, for example.


Re: [PATCH 8/8] pathspec: convert parse_pathspec to take an index

2017-05-11 Thread Brandon Williams
On 05/11, Junio C Hamano wrote:
> Brandon Williams  writes:
> 
> > ls-files is the only command (that I know of) which does cache pruning
> > based on the common prefix of all pathspecs given.  If there is a
> > submodule named 'sub' and you provided a pathspec 'sub/', the matching
> > logic can handle this but the cache pruning logic would prune all
> > entries from the index which don't have a leading 'sub/' which is the
> > common prefix of all pathspecs (if you didn't strip off the slash).
> > Meaning you'd prune the submodule 'sub' when  you really didn't want to.
> > This could probably be fixed to have the cache pruning logic to prune by
> > ignoring the trailing slash always.
> >
> > So there's another option here if you don't feel quite right about
> > piping through an index into parse_pathspec just yet.
> 
> Oh, don't get me wrong---I do not think passing an index_state
> instance throughout the callchain (and perhaps later we may pass an
> instance of "repository" instead) is a wrong move in the longer term
> for various APIs.  I was just wondering if we have callers that can
> benefit from this change immediately---manipulators like "commit" do
> already use multiple instances of index_state structure.

I didn't get the feeling you thought it was a bad change.  I really
appreciate your thoughts since they are things which I also was
wondering about.

> But perhaps you are right---it may be wrong for the contents of the
> current index (or any index) to affect how a pathspec element is
> parsed in the first place.  If the current code (before this series)
> uses the_index only for error checking, we may want to separate that
> out of the parse_pathspec() callchain, so that it does not even look
> at any index (not just the_index).  And that may be a better change
> overall.

I'll polish up the other version of the series which does the processing
(using an index) outside of parse_pathspec() and let you see how you
feel about those changes.  Note that if we go with the route to not pass
in an index now, it doesn't necessarily mean that down the line we won't
have to pass a 'repository' instance into parse_pathspec().  Just food
for thought.

-- 
Brandon Williams


Re: [PATCH 8/8] pathspec: convert parse_pathspec to take an index

2017-05-10 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 11.05.2017 um 03:48 schrieb Junio C Hamano:
>> But perhaps you are right---it may be wrong for the contents of the
>> current index (or any index) to affect how a pathspec element is
>> parsed in the first place.  If the current code (before this series)
>> uses the_index only for error checking, we may want to separate that
>> out of the parse_pathspec() callchain, so that it does not even look
>> at any index (not just the_index).  And that may be a better change
>> overall.
>
> Just a reminder: if core.ignoreCase is set, the variant of a path in
> the index takes precedence over the variant found in the working
> tree. Hence, pathspec must be matched against the index that
> corresponds to the working tree. I guess that's the_index.

Yes, but that is what happens after a path from a working tree is
found to match a pathspec, to coerse its case into the one that is
in the current index.  The parse_pathspec() thing we are discussing
would have finished long time before that actual "matching" is
attempted.

Thanks.


Re: [PATCH 8/8] pathspec: convert parse_pathspec to take an index

2017-05-10 Thread Johannes Sixt

Am 11.05.2017 um 03:48 schrieb Junio C Hamano:

But perhaps you are right---it may be wrong for the contents of the
current index (or any index) to affect how a pathspec element is
parsed in the first place.  If the current code (before this series)
uses the_index only for error checking, we may want to separate that
out of the parse_pathspec() callchain, so that it does not even look
at any index (not just the_index).  And that may be a better change
overall.


Just a reminder: if core.ignoreCase is set, the variant of a path in the 
index takes precedence over the variant found in the working tree. 
Hence, pathspec must be matched against the index that corresponds to 
the working tree. I guess that's the_index.


-- Hannes


Re: [PATCH 8/8] pathspec: convert parse_pathspec to take an index

2017-05-10 Thread Junio C Hamano
Brandon Williams  writes:

> ls-files is the only command (that I know of) which does cache pruning
> based on the common prefix of all pathspecs given.  If there is a
> submodule named 'sub' and you provided a pathspec 'sub/', the matching
> logic can handle this but the cache pruning logic would prune all
> entries from the index which don't have a leading 'sub/' which is the
> common prefix of all pathspecs (if you didn't strip off the slash).
> Meaning you'd prune the submodule 'sub' when  you really didn't want to.
> This could probably be fixed to have the cache pruning logic to prune by
> ignoring the trailing slash always.
>
> So there's another option here if you don't feel quite right about
> piping through an index into parse_pathspec just yet.

Oh, don't get me wrong---I do not think passing an index_state
instance throughout the callchain (and perhaps later we may pass an
instance of "repository" instead) is a wrong move in the longer term
for various APIs.  I was just wondering if we have callers that can
benefit from this change immediately---manipulators like "commit" do
already use multiple instances of index_state structure.

But perhaps you are right---it may be wrong for the contents of the
current index (or any index) to affect how a pathspec element is
parsed in the first place.  If the current code (before this series)
uses the_index only for error checking, we may want to separate that
out of the parse_pathspec() callchain, so that it does not even look
at any index (not just the_index).  And that may be a better change
overall.

Thanks.



Re: [PATCH 8/8] pathspec: convert parse_pathspec to take an index

2017-05-10 Thread Brandon Williams
On 05/10, Junio C Hamano wrote:
> Brandon Williams  writes:
> 
> > Convert 'parse_pathspec()' to take an index parameter.
> >
> > Since the index is only needed when the PATHSPEC_SUBMODULE_LEADING_PATH
> > and PATHSPEC_STRIP_SUBMODULE_SLASH flags are given, add a check
> > requiring a non-NULL index when either of these flags are given.
> > Convert callers which use these two flags to pass in an index while
> > having other callers pass in NULL.
> >
> > Now that pathspec.c does not use any cache macros and has no references
> > to 'the_index', mark it by defining NO_THE_INDEX_COMPATIBILITY_MACROS.
> >
> > Signed-off-by: Brandon Williams 
> 
> The same comment as 5/8 applies to this change, but it is a bit
> easier to judge, because it has so many callers, and for some
> builtins, especially manipulator commands like add, checkout, and
> commit, there may be a good reason why they want to keep the primary
> index while playing with an additional in-core index in a distant
> future.
> 
> Does a pathspec parsed using one instance of index_state expected to
> work when matching against a path in another instance of index_state?
> Otherwise, passing a non-NULL istate to parse_pathspec() would tie
> the resulting pathspec to a particular index_state in some way and
> there may need a mechanism to catch an attempt to match paths in
> another index_state with such a pathspec as an error.  Just
> speculating out loud...
> 

Currently I don't believe this links a pathspec with a particular
index_state since an index is really just used to do some pre-processing
on the paths (check if the path goes into a submodule and die, or strip
a slash if the path matches a submodule), though I can see a future where
this is true.

I did have another version of this series where I completely removed the
two flags related to submodules.  Since builtin/add is the only caller
which needs to die if a path descends into a submodule, I had a function
which did this after the parse_pathspec() call.  Also, since (ae8d08242
pathspec: pass directory indicator to match_pathspec_item()) stripping
off the slash from a submodule path really is no longer needed for the
path matching logic, this means that we could potentially just drop the
strip slash flag.  The only caveat is ls-files.

ls-files is the only command (that I know of) which does cache pruning
based on the common prefix of all pathspecs given.  If there is a
submodule named 'sub' and you provided a pathspec 'sub/', the matching
logic can handle this but the cache pruning logic would prune all
entries from the index which don't have a leading 'sub/' which is the
common prefix of all pathspecs (if you didn't strip off the slash).
Meaning you'd prune the submodule 'sub' when  you really didn't want to.
This could probably be fixed to have the cache pruning logic to prune by
ignoring the trailing slash always.

So there's another option here if you don't feel quite right about
piping through an index into parse_pathspec just yet.

-- 
Brandon Williams


Re: [PATCH 8/8] pathspec: convert parse_pathspec to take an index

2017-05-10 Thread Junio C Hamano
Brandon Williams  writes:

> Convert 'parse_pathspec()' to take an index parameter.
>
> Since the index is only needed when the PATHSPEC_SUBMODULE_LEADING_PATH
> and PATHSPEC_STRIP_SUBMODULE_SLASH flags are given, add a check
> requiring a non-NULL index when either of these flags are given.
> Convert callers which use these two flags to pass in an index while
> having other callers pass in NULL.
>
> Now that pathspec.c does not use any cache macros and has no references
> to 'the_index', mark it by defining NO_THE_INDEX_COMPATIBILITY_MACROS.
>
> Signed-off-by: Brandon Williams 

The same comment as 5/8 applies to this change, but it is a bit
easier to judge, because it has so many callers, and for some
builtins, especially manipulator commands like add, checkout, and
commit, there may be a good reason why they want to keep the primary
index while playing with an additional in-core index in a distant
future.

Does a pathspec parsed using one instance of index_state expected to
work when matching against a path in another instance of index_state?
Otherwise, passing a non-NULL istate to parse_pathspec() would tie
the resulting pathspec to a particular index_state in some way and
there may need a mechanism to catch an attempt to match paths in
another index_state with such a pathspec as an error.  Just
speculating out loud...