Re: [PATCH 1/3] name-hash: refactor polymorphic index_name_exists()
On Fri, Sep 13, 2013 at 6:16 PM, Junio C Hamano wrote: > Eric Sunshine writes: > >> Given the above. How should I proceed? Do you still feel that it is >> advisable to keep an index_name_exists() around for compatibility >> reasons in case any new callers are introduced? Regardless of that >> answer, do you want index_name_exists() renamed to >> index_file_exists()? > > Renaming *_name_exists() to *_file_exists() without keeping a > compatibility one will force new topics to be rebased on this > series. Alternatively we could merge them to 'pu' (and later 'next' > and 'master') with evil merges to adjust the change in the semantics > of the called function. That increases the risk of accidental > breakages, I think. > > It is safer to keep index_name_exists() around with the older > semantics, if we can, and rename your "file only" one to a different > name. That way, even if a new topic still uses index_name_exists() > expecting the traditional behaviour, it will not break immediately > and we do not need to risk evil merges making mistakes. > > Later, we can "git grep _name_exists" to spot them and convert such > old-style calls to either "directory only" or "file only" variants > after this series and these follow-on topics hit 'master' (and we do > not know at this point in what order that happens). Thanks. That's what I needed to know. I'll re-roll with the suggested changes. (And, I'm looking into the Mac-only test breakages not related to this patch series.) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] name-hash: refactor polymorphic index_name_exists()
Eric Sunshine writes: > Given the above. How should I proceed? Do you still feel that it is > advisable to keep an index_name_exists() around for compatibility > reasons in case any new callers are introduced? Regardless of that > answer, do you want index_name_exists() renamed to > index_file_exists()? Renaming *_name_exists() to *_file_exists() without keeping a compatibility one will force new topics to be rebased on this series. Alternatively we could merge them to 'pu' (and later 'next' and 'master') with evil merges to adjust the change in the semantics of the called function. That increases the risk of accidental breakages, I think. It is safer to keep index_name_exists() around with the older semantics, if we can, and rename your "file only" one to a different name. That way, even if a new topic still uses index_name_exists() expecting the traditional behaviour, it will not break immediately and we do not need to risk evil merges making mistakes. Later, we can "git grep _name_exists" to spot them and convert such old-style calls to either "directory only" or "file only" variants after this series and these follow-on topics hit 'master' (and we do not know at this point in what order that happens). Hmm? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] name-hash: refactor polymorphic index_name_exists()
On Fri, Sep 13, 2013 at 3:41 PM, Junio C Hamano wrote: > On Fri, Sep 13, 2013 at 12:29 PM, Eric Sunshine > wrote: >> On Fri, Sep 13, 2013 at 2:40 PM, Junio C Hamano wrote: >>> Eric Sunshine writes: >>> Since these two modes of operations are disjoint and have no code in common (one searches index_state.name_hash; the other dir_hash), they can be represented more naturally as distinct functions: one to search for a file, and one for a directory. >>> >>> Good thinking. Thanks for working on this; I agree with the general >>> direction this series is going. >>> >>> I however wonder if it would be a good idea to rename the other one >>> to {cache|index}_file_exists(), and even keep the *_name_exists() >>> variant that switches between the two based on the trailing slash, >>> to avoid breaking other topics in flight that may have added new >>> callsites to *_name_exists(). Otherwise the new callsites will feed >>> a path with a trailing slash to *_name_exists() and get a false >>> result, no? >> >> An assert("no trailing /") was added to index_name_exists() precisely >> for the purpose of catching cases of code incorrectly calling >> index_name_exists() to search for a directory. No existing code in >> 'master' trips the assertion (at least according to the tests), >> however, the assertion may be annoying for topics in flight which do. >> >> Other than plopping these patches atop 'pu' and seeing if anything >> breaks, what would be the "git way" of detecting in-flight topics >> which add callers of index_name_exists()? (Excuse my git ignorance.) > > I would do a quick > > $ git diff master..pu | grep '^+.*_name_exists' > > which would be more conservative (it would also show an invocation > moved from one place to another). There appear to be no topics in flight which add new index_name_exists() callers (which is not to say that no new callers will be introduced before this topic lands in 'master'). I also plopped the patches atop 'pu' and there are no new tests failures on Linux or Mac OS X, so the series does not seem to break anything in flight. (There are a few test failures on 'pu' on Mac OS X, but they are not related to this series.) Given the above. How should I proceed? Do you still feel that it is advisable to keep an index_name_exists() around for compatibility reasons in case any new callers are introduced? Regardless of that answer, do you want index_name_exists() renamed to index_file_exists()? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] name-hash: refactor polymorphic index_name_exists()
On Fri, Sep 13, 2013 at 2:40 PM, Junio C Hamano wrote: > Eric Sunshine writes: > >> Depending upon the absence or presence of a trailing '/' on the incoming >> pathname, index_name_exists() checks either if a file is present in the >> index or if a directory is represented within the index. Each caller >> explicitly chooses the mode of operation by adding or removing a >> trailing '/' before invoking index_name_exists(). >> >> Since these two modes of operations are disjoint and have no code in >> common (one searches index_state.name_hash; the other dir_hash), they >> can be represented more naturally as distinct functions: one to search >> for a file, and one for a directory. >> >> Splitting index searching into two functions relieves callers of the >> artificial burden of having to add or remove a slash to select the mode >> of operation; instead they just call the desired function. A subsequent >> patch will take advantage of this benefit in order to eliminate the >> requirement that the incoming pathname for a directory search must have >> a trailing slash. > > Good thinking. Thanks for working on this; I agree with the general > direction this series is going. > > I however wonder if it would be a good idea to rename the other one > to {cache|index}_file_exists(), and even keep the *_name_exists() > variant that switches between the two based on the trailing slash, > to avoid breaking other topics in flight that may have added new > callsites to *_name_exists(). Otherwise the new callsites will feed > a path with a trailing slash to *_name_exists() and get a false > result, no? An assert("no trailing /") was added to index_name_exists() precisely for the purpose of catching cases of code incorrectly calling index_name_exists() to search for a directory. No existing code in 'master' trips the assertion (at least according to the tests), however, the assertion may be annoying for topics in flight which do. Other than plopping these patches atop 'pu' and seeing if anything breaks, what would be the "git way" of detecting in-flight topics which add callers of index_name_exists()? (Excuse my git ignorance.) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] name-hash: refactor polymorphic index_name_exists()
Eric Sunshine writes: > Depending upon the absence or presence of a trailing '/' on the incoming > pathname, index_name_exists() checks either if a file is present in the > index or if a directory is represented within the index. Each caller > explicitly chooses the mode of operation by adding or removing a > trailing '/' before invoking index_name_exists(). > > Since these two modes of operations are disjoint and have no code in > common (one searches index_state.name_hash; the other dir_hash), they > can be represented more naturally as distinct functions: one to search > for a file, and one for a directory. > > Splitting index searching into two functions relieves callers of the > artificial burden of having to add or remove a slash to select the mode > of operation; instead they just call the desired function. A subsequent > patch will take advantage of this benefit in order to eliminate the > requirement that the incoming pathname for a directory search must have > a trailing slash. Good thinking. Thanks for working on this; I agree with the general direction this series is going. I however wonder if it would be a good idea to rename the other one to {cache|index}_file_exists(), and even keep the *_name_exists() variant that switches between the two based on the trailing slash, to avoid breaking other topics in flight that may have added new callsites to *_name_exists(). Otherwise the new callsites will feed a path with a trailing slash to *_name_exists() and get a false result, no? > Signed-off-by: Eric Sunshine > --- > cache.h | 2 ++ > dir.c| 7 --- > name-hash.c | 47 --- > read-cache.c | 2 +- > 4 files changed, 31 insertions(+), 27 deletions(-) > > diff --git a/cache.h b/cache.h > index 9ef778a..03ec24c 100644 > --- a/cache.h > +++ b/cache.h > @@ -314,6 +314,7 @@ extern void free_name_hash(struct index_state *istate); > #define refresh_cache(flags) refresh_index(&the_index, (flags), NULL, NULL, > NULL) > #define ce_match_stat(ce, st, options) ie_match_stat(&the_index, (ce), (st), > (options)) > #define ce_modified(ce, st, options) ie_modified(&the_index, (ce), (st), > (options)) > +#define cache_dir_exists(name, namelen) index_dir_exists(&the_index, (name), > (namelen)) > #define cache_name_exists(name, namelen, igncase) > index_name_exists(&the_index, (name), (namelen), (igncase)) > #define cache_name_is_other(name, namelen) index_name_is_other(&the_index, > (name), (namelen)) > #define resolve_undo_clear() resolve_undo_clear_index(&the_index) > @@ -463,6 +464,7 @@ extern int write_index(struct index_state *, int newfd); > extern int discard_index(struct index_state *); > extern int unmerged_index(const struct index_state *); > extern int verify_path(const char *path); > +extern struct cache_entry *index_dir_exists(struct index_state *istate, > const char *name, int namelen); > extern struct cache_entry *index_name_exists(struct index_state *istate, > const char *name, int namelen, int igncase); > extern int index_name_pos(const struct index_state *, const char *name, int > namelen); > #define ADD_CACHE_OK_TO_ADD 1/* Ok to add */ > diff --git a/dir.c b/dir.c > index b439ff0..0080673 100644 > --- a/dir.c > +++ b/dir.c > @@ -860,7 +860,8 @@ static struct dir_entry *dir_entry_new(const char > *pathname, int len) > > static struct dir_entry *dir_add_name(struct dir_struct *dir, const char > *pathname, int len) > { > - if (cache_name_exists(pathname, len, ignore_case)) > + if ((len == 0 || pathname[len - 1] != '/') && > + cache_name_exists(pathname, len, ignore_case)) > return NULL; > > ALLOC_GROW(dir->entries, dir->nr+1, dir->alloc); > @@ -885,11 +886,11 @@ enum exist_status { > /* > * Do not use the alphabetically sorted index to look up > * the directory name; instead, use the case insensitive > - * name hash. > + * directory hash. > */ > static enum exist_status directory_exists_in_index_icase(const char > *dirname, int len) > { > - const struct cache_entry *ce = cache_name_exists(dirname, len + 1, > ignore_case); > + const struct cache_entry *ce = cache_dir_exists(dirname, len + 1); > unsigned char endchar; > > if (!ce) > diff --git a/name-hash.c b/name-hash.c > index 617c86c..5b01554 100644 > --- a/name-hash.c > +++ b/name-hash.c > @@ -222,11 +222,35 @@ static int same_name(const struct cache_entry *ce, > const char *name, int namelen > return slow_same_name(name, namelen, ce->name, len); > } > > +struct cache_entry *index_dir_exists(struct index_state *istate, const char > *name, int namelen) > +{ > + struct cache_entry *ce; > + struct dir_entry *dir; > + > + lazy_init_name_hash(istate); > + dir = find_dir_entry(istate, name, namelen); > + if (dir && dir->nr) > + return dir->ce; > + > + /* > + * It might be a submodule. Unlike plain directories, which are