Re: [PATCH 1/3] name-hash: refactor polymorphic index_name_exists()

2013-09-13 Thread Eric Sunshine
On Fri, Sep 13, 2013 at 6:16 PM, Junio C Hamano  wrote:
> Eric Sunshine  writes:
>
>> Given the above. How should I proceed? Do you still feel that it is
>> advisable to keep an index_name_exists() around for compatibility
>> reasons in case any new callers are introduced? Regardless of that
>> answer, do you want index_name_exists() renamed to
>> index_file_exists()?
>
> Renaming *_name_exists() to *_file_exists() without keeping a
> compatibility one will force new topics to be rebased on this
> series.  Alternatively we could merge them to 'pu' (and later 'next'
> and 'master') with evil merges to adjust the change in the semantics
> of the called function.  That increases the risk of accidental
> breakages, I think.
>
> It is safer to keep index_name_exists() around with the older
> semantics, if we can, and rename your "file only" one to a different
> name.  That way, even if a new topic still uses index_name_exists()
> expecting the traditional behaviour, it will not break immediately
> and we do not need to risk evil merges making mistakes.
>
> Later, we can "git grep _name_exists" to spot them and convert such
> old-style calls to either "directory only" or "file only" variants
> after this series and these follow-on topics hit 'master' (and we do
> not know at this point in what order that happens).

Thanks. That's what I needed to know. I'll re-roll with the suggested changes.

(And, I'm looking into the Mac-only test breakages not related to this
patch series.)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] name-hash: refactor polymorphic index_name_exists()

2013-09-13 Thread Junio C Hamano
Eric Sunshine  writes:

> Given the above. How should I proceed? Do you still feel that it is
> advisable to keep an index_name_exists() around for compatibility
> reasons in case any new callers are introduced? Regardless of that
> answer, do you want index_name_exists() renamed to
> index_file_exists()?

Renaming *_name_exists() to *_file_exists() without keeping a
compatibility one will force new topics to be rebased on this
series.  Alternatively we could merge them to 'pu' (and later 'next'
and 'master') with evil merges to adjust the change in the semantics
of the called function.  That increases the risk of accidental
breakages, I think.

It is safer to keep index_name_exists() around with the older
semantics, if we can, and rename your "file only" one to a different
name.  That way, even if a new topic still uses index_name_exists()
expecting the traditional behaviour, it will not break immediately
and we do not need to risk evil merges making mistakes.

Later, we can "git grep _name_exists" to spot them and convert such
old-style calls to either "directory only" or "file only" variants
after this series and these follow-on topics hit 'master' (and we do
not know at this point in what order that happens).

Hmm?




--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] name-hash: refactor polymorphic index_name_exists()

2013-09-13 Thread Eric Sunshine
On Fri, Sep 13, 2013 at 3:41 PM, Junio C Hamano  wrote:
> On Fri, Sep 13, 2013 at 12:29 PM, Eric Sunshine  
> wrote:
>> On Fri, Sep 13, 2013 at 2:40 PM, Junio C Hamano  wrote:
>>> Eric Sunshine  writes:
>>>
 Since these two modes of operations are disjoint and have no code in
 common (one searches index_state.name_hash; the other dir_hash), they
 can be represented more naturally as distinct functions: one to search
 for a file, and one for a directory.
>>>
>>> Good thinking.  Thanks for working on this; I agree with the general
>>> direction this series is going.
>>>
>>> I however wonder if it would be a good idea to rename the other one
>>> to {cache|index}_file_exists(), and even keep the *_name_exists()
>>> variant that switches between the two based on the trailing slash,
>>> to avoid breaking other topics in flight that may have added new
>>> callsites to *_name_exists().  Otherwise the new callsites will feed
>>> a path with a trailing slash to *_name_exists() and get a false
>>> result, no?
>>
>> An assert("no trailing /") was added to index_name_exists() precisely
>> for the purpose of catching cases of code incorrectly calling
>> index_name_exists() to search for a directory. No existing code in
>> 'master' trips the assertion (at least according to the tests),
>> however, the assertion may be annoying for topics in flight which do.
>>
>> Other than plopping these patches atop 'pu' and seeing if anything
>> breaks, what would be the "git way" of detecting in-flight topics
>> which add callers of index_name_exists()? (Excuse my git ignorance.)
>
> I would do a quick
>
> $ git diff master..pu | grep '^+.*_name_exists'
>
> which would be more conservative (it would also show an invocation
> moved from one place to another).

There appear to be no topics in flight which add new
index_name_exists() callers (which is not to say that no new callers
will be introduced before this topic lands in 'master').

I also plopped the patches atop 'pu' and there are no new tests
failures on Linux or Mac OS X, so the series does not seem to break
anything in flight. (There are a few test failures on 'pu' on Mac OS
X, but they are not related to this series.)

Given the above. How should I proceed? Do you still feel that it is
advisable to keep an index_name_exists() around for compatibility
reasons in case any new callers are introduced? Regardless of that
answer, do you want index_name_exists() renamed to
index_file_exists()?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] name-hash: refactor polymorphic index_name_exists()

2013-09-13 Thread Eric Sunshine
On Fri, Sep 13, 2013 at 2:40 PM, Junio C Hamano  wrote:
> Eric Sunshine  writes:
>
>> Depending upon the absence or presence of a trailing '/' on the incoming
>> pathname, index_name_exists() checks either if a file is present in the
>> index or if a directory is represented within the index.  Each caller
>> explicitly chooses the mode of operation by adding or removing a
>> trailing '/' before invoking index_name_exists().
>>
>> Since these two modes of operations are disjoint and have no code in
>> common (one searches index_state.name_hash; the other dir_hash), they
>> can be represented more naturally as distinct functions: one to search
>> for a file, and one for a directory.
>>
>> Splitting index searching into two functions relieves callers of the
>> artificial burden of having to add or remove a slash to select the mode
>> of operation; instead they just call the desired function. A subsequent
>> patch will take advantage of this benefit in order to eliminate the
>> requirement that the incoming pathname for a directory search must have
>> a trailing slash.
>
> Good thinking.  Thanks for working on this; I agree with the general
> direction this series is going.
>
> I however wonder if it would be a good idea to rename the other one
> to {cache|index}_file_exists(), and even keep the *_name_exists()
> variant that switches between the two based on the trailing slash,
> to avoid breaking other topics in flight that may have added new
> callsites to *_name_exists().  Otherwise the new callsites will feed
> a path with a trailing slash to *_name_exists() and get a false
> result, no?

An assert("no trailing /") was added to index_name_exists() precisely
for the purpose of catching cases of code incorrectly calling
index_name_exists() to search for a directory. No existing code in
'master' trips the assertion (at least according to the tests),
however, the assertion may be annoying for topics in flight which do.

Other than plopping these patches atop 'pu' and seeing if anything
breaks, what would be the "git way" of detecting in-flight topics
which add callers of index_name_exists()? (Excuse my git ignorance.)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] name-hash: refactor polymorphic index_name_exists()

2013-09-13 Thread Junio C Hamano
Eric Sunshine  writes:

> Depending upon the absence or presence of a trailing '/' on the incoming
> pathname, index_name_exists() checks either if a file is present in the
> index or if a directory is represented within the index.  Each caller
> explicitly chooses the mode of operation by adding or removing a
> trailing '/' before invoking index_name_exists().
>
> Since these two modes of operations are disjoint and have no code in
> common (one searches index_state.name_hash; the other dir_hash), they
> can be represented more naturally as distinct functions: one to search
> for a file, and one for a directory.
>
> Splitting index searching into two functions relieves callers of the
> artificial burden of having to add or remove a slash to select the mode
> of operation; instead they just call the desired function. A subsequent
> patch will take advantage of this benefit in order to eliminate the
> requirement that the incoming pathname for a directory search must have
> a trailing slash.

Good thinking.  Thanks for working on this; I agree with the general
direction this series is going.

I however wonder if it would be a good idea to rename the other one
to {cache|index}_file_exists(), and even keep the *_name_exists()
variant that switches between the two based on the trailing slash,
to avoid breaking other topics in flight that may have added new
callsites to *_name_exists().  Otherwise the new callsites will feed
a path with a trailing slash to *_name_exists() and get a false
result, no?


> Signed-off-by: Eric Sunshine 
> ---
>  cache.h  |  2 ++
>  dir.c|  7 ---
>  name-hash.c  | 47 ---
>  read-cache.c |  2 +-
>  4 files changed, 31 insertions(+), 27 deletions(-)
>
> diff --git a/cache.h b/cache.h
> index 9ef778a..03ec24c 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -314,6 +314,7 @@ extern void free_name_hash(struct index_state *istate);
>  #define refresh_cache(flags) refresh_index(&the_index, (flags), NULL, NULL, 
> NULL)
>  #define ce_match_stat(ce, st, options) ie_match_stat(&the_index, (ce), (st), 
> (options))
>  #define ce_modified(ce, st, options) ie_modified(&the_index, (ce), (st), 
> (options))
> +#define cache_dir_exists(name, namelen) index_dir_exists(&the_index, (name), 
> (namelen))
>  #define cache_name_exists(name, namelen, igncase) 
> index_name_exists(&the_index, (name), (namelen), (igncase))
>  #define cache_name_is_other(name, namelen) index_name_is_other(&the_index, 
> (name), (namelen))
>  #define resolve_undo_clear() resolve_undo_clear_index(&the_index)
> @@ -463,6 +464,7 @@ extern int write_index(struct index_state *, int newfd);
>  extern int discard_index(struct index_state *);
>  extern int unmerged_index(const struct index_state *);
>  extern int verify_path(const char *path);
> +extern struct cache_entry *index_dir_exists(struct index_state *istate, 
> const char *name, int namelen);
>  extern struct cache_entry *index_name_exists(struct index_state *istate, 
> const char *name, int namelen, int igncase);
>  extern int index_name_pos(const struct index_state *, const char *name, int 
> namelen);
>  #define ADD_CACHE_OK_TO_ADD 1/* Ok to add */
> diff --git a/dir.c b/dir.c
> index b439ff0..0080673 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -860,7 +860,8 @@ static struct dir_entry *dir_entry_new(const char 
> *pathname, int len)
>  
>  static struct dir_entry *dir_add_name(struct dir_struct *dir, const char 
> *pathname, int len)
>  {
> - if (cache_name_exists(pathname, len, ignore_case))
> + if ((len == 0 || pathname[len - 1] != '/') &&
> + cache_name_exists(pathname, len, ignore_case))
>   return NULL;
>  
>   ALLOC_GROW(dir->entries, dir->nr+1, dir->alloc);
> @@ -885,11 +886,11 @@ enum exist_status {
>  /*
>   * Do not use the alphabetically sorted index to look up
>   * the directory name; instead, use the case insensitive
> - * name hash.
> + * directory hash.
>   */
>  static enum exist_status directory_exists_in_index_icase(const char 
> *dirname, int len)
>  {
> - const struct cache_entry *ce = cache_name_exists(dirname, len + 1, 
> ignore_case);
> + const struct cache_entry *ce = cache_dir_exists(dirname, len + 1);
>   unsigned char endchar;
>  
>   if (!ce)
> diff --git a/name-hash.c b/name-hash.c
> index 617c86c..5b01554 100644
> --- a/name-hash.c
> +++ b/name-hash.c
> @@ -222,11 +222,35 @@ static int same_name(const struct cache_entry *ce, 
> const char *name, int namelen
>   return slow_same_name(name, namelen, ce->name, len);
>  }
>  
> +struct cache_entry *index_dir_exists(struct index_state *istate, const char 
> *name, int namelen)
> +{
> + struct cache_entry *ce;
> + struct dir_entry *dir;
> +
> + lazy_init_name_hash(istate);
> + dir = find_dir_entry(istate, name, namelen);
> + if (dir && dir->nr)
> + return dir->ce;
> +
> + /*
> +  * It might be a submodule. Unlike plain directories, which are