Nguyễn Thái Ngọc Duy  <pclo...@gmail.com> writes:

> In checkout_paths() we do this
>
>  - for all updated items, call match_pathspec
>  - for all items, call match_pathspec (inside unmerge_cache)
>  - for all items, call match_pathspec (for showing "path .. is unmerged)
>  - for updated items, call match_pathspec and update paths
>
> That's a lot of duplicate match_pathspec(s) and the function is not
> exactly cheap to be called so many times, especially on large indexes.
> This patch makes it call match_pathspec once per updated index entry,
> save the result in ce_flags and reuse the results in the following
> loops.
>
> The changes in 0a1283b (checkout $tree $path: do not clobber local
> changes in $path not in $tree - 2011-09-30) limit the affected paths
> to ones we read from $tree. We do not do anything to other modified
> entries in this case, so the "for all items" above could be modified
> to "for all updated items". But..
>
> The command's behavior now is modified slightly: unmerged entries that
> match $path, but not updated by $tree, are now NOT touched.  Although
> this should be considered a bug fix, not a regression.

Could we have a test to show the difference please, especially if we
are going to sell this as a fix?

The change itself looks quite sane to me (I didn't apply or test it,
though---just eyeballing).

Thanks.

>
> And while at there, free ps_matched after use.
>
> The following command is tested on webkit, 215k entries. The pattern
> is chosen mainly to make match_pathspec sweat:
>
> git checkout -- "*[a-zA-Z]*[a-zA-Z]*[a-zA-Z]*"
>
>         before      after
> real    0m3.493s    0m2.737s
> user    0m2.239s    0m1.586s
> sys     0m1.252s    0m1.151s
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclo...@gmail.com>
> ---
>  > and worry about larger ones later, so if there were another choice,
>  > i.e.
>  >
>  >  - eject nd/magic-pathspecs for now, cook this (and other small
>  >    independent and clear improvements you may come up with, some of
>  >    which might come out of nd/magic-pathspecs itself) and let it
>  >    graduate first, and later revisit rerolld nd/magic-pathspecs
>  >
>  > that would be the ideal among the given choices ;-).
>
>  Whichever is easier for you.
>
>  > The above is a faithful rewrite, but I have to wonder why you need
>  > two separate loops.
>  >
>  > Do you understand what the original loop is doing with ps_matched,
>  > and why the code excludes certain paths while doing so?
>
>  Nope, I did not dig that deep. I expected you to do it ;-) j/k
>
>  > After commenting on the above, it makes me wonder if we even need to
>  > bother marking entries that were in the index that did not come from
>  > the tree-ish we are checking paths out of, though.  What breaks if
>  > you did not do the rewrite above and dropped the second loop in your
>  > patch?
>
>  The test suite says none. There is a behavior change regarding
>  unmerged entries as mentioned in the commit message. But I think it's
>  a good change.
>
>  builtin/checkout.c | 34 +++++++++++++++++++++++++++-------
>  cache.h            |  1 +
>  resolve-undo.c     | 19 ++++++++++++++++++-
>  resolve-undo.h     |  1 +
>  4 files changed, 47 insertions(+), 8 deletions(-)
>
> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index a9c1b5a..359b983 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -271,24 +271,46 @@ static int checkout_paths(const struct checkout_opts 
> *opts,
>               ;
>       ps_matched = xcalloc(1, pos);
>  
> +     /*
> +      * Make sure all pathspecs participated in locating the paths
> +      * to be checked out.
> +      */
>       for (pos = 0; pos < active_nr; pos++) {
>               struct cache_entry *ce = active_cache[pos];
> +             ce->ce_flags &= ~CE_MATCHED;
>               if (opts->source_tree && !(ce->ce_flags & CE_UPDATE))
> +                     /*
> +                      * "git checkout tree-ish -- path", but this entry
> +                      * is in the original index; it will not be checked
> +                      * out to the working tree and it does not matter
> +                      * if pathspec matched this entry.  We will not do
> +                      * anything to this entry at all.
> +                      */
>                       continue;
> -             match_pathspec(opts->pathspec, ce->name, ce_namelen(ce), 0, 
> ps_matched);
> +             /*
> +              * Either this entry came from the tree-ish we are
> +              * checking the paths out of, or we are checking out
> +              * of the index.
> +              */
> +             if (match_pathspec(opts->pathspec, ce->name, ce_namelen(ce),
> +                                0, ps_matched))
> +                     ce->ce_flags |= CE_MATCHED;
>       }
>  
> -     if (report_path_error(ps_matched, opts->pathspec, opts->prefix))
> +     if (report_path_error(ps_matched, opts->pathspec, opts->prefix)) {
> +             free(ps_matched);
>               return 1;
> +     }
> +     free(ps_matched);
>  
>       /* "checkout -m path" to recreate conflicted state */
>       if (opts->merge)
> -             unmerge_cache(opts->pathspec);
> +             unmerge_marked_index(&the_index);
>  
>       /* Any unmerged paths? */
>       for (pos = 0; pos < active_nr; pos++) {
>               struct cache_entry *ce = active_cache[pos];
> -             if (match_pathspec(opts->pathspec, ce->name, ce_namelen(ce), 0, 
> NULL)) {
> +             if (ce->ce_flags & CE_MATCHED) {
>                       if (!ce_stage(ce))
>                               continue;
>                       if (opts->force) {
> @@ -313,9 +335,7 @@ static int checkout_paths(const struct checkout_opts 
> *opts,
>       state.refresh_cache = 1;
>       for (pos = 0; pos < active_nr; pos++) {
>               struct cache_entry *ce = active_cache[pos];
> -             if (opts->source_tree && !(ce->ce_flags & CE_UPDATE))
> -                     continue;
> -             if (match_pathspec(opts->pathspec, ce->name, ce_namelen(ce), 0, 
> NULL)) {
> +             if (ce->ce_flags & CE_MATCHED) {
>                       if (!ce_stage(ce)) {
>                               errs |= checkout_entry(ce, &state, NULL);
>                               continue;
> diff --git a/cache.h b/cache.h
> index c56315c..04e6090 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -161,6 +161,7 @@ struct cache_entry {
>  
>  #define CE_UNPACKED          (1 << 24)
>  #define CE_NEW_SKIP_WORKTREE (1 << 25)
> +#define CE_MATCHED           (1 << 26)
>  
>  /*
>   * Extended on-disk flags
> diff --git a/resolve-undo.c b/resolve-undo.c
> index 72b4612..639eb9c 100644
> --- a/resolve-undo.c
> +++ b/resolve-undo.c
> @@ -118,7 +118,7 @@ int unmerge_index_entry_at(struct index_state *istate, 
> int pos)
>       struct cache_entry *ce;
>       struct string_list_item *item;
>       struct resolve_undo_info *ru;
> -     int i, err = 0;
> +     int i, err = 0, matched;
>  
>       if (!istate->resolve_undo)
>               return pos;
> @@ -137,6 +137,7 @@ int unmerge_index_entry_at(struct index_state *istate, 
> int pos)
>       ru = item->util;
>       if (!ru)
>               return pos;
> +     matched = ce->ce_flags & CE_MATCHED;
>       remove_index_entry_at(istate, pos);
>       for (i = 0; i < 3; i++) {
>               struct cache_entry *nce;
> @@ -144,6 +145,8 @@ int unmerge_index_entry_at(struct index_state *istate, 
> int pos)
>                       continue;
>               nce = make_cache_entry(ru->mode[i], ru->sha1[i],
>                                      ce->name, i + 1, 0);
> +             if (matched)
> +                     nce->ce_flags |= CE_MATCHED;
>               if (add_index_entry(istate, nce, ADD_CACHE_OK_TO_ADD)) {
>                       err = 1;
>                       error("cannot unmerge '%s'", ce->name);
> @@ -156,6 +159,20 @@ int unmerge_index_entry_at(struct index_state *istate, 
> int pos)
>       return unmerge_index_entry_at(istate, pos);
>  }
>  
> +void unmerge_marked_index(struct index_state *istate)
> +{
> +     int i;
> +
> +     if (!istate->resolve_undo)
> +             return;
> +
> +     for (i = 0; i < istate->cache_nr; i++) {
> +             struct cache_entry *ce = istate->cache[i];
> +             if (ce->ce_flags & CE_MATCHED)
> +                     i = unmerge_index_entry_at(istate, i);
> +     }
> +}
> +
>  void unmerge_index(struct index_state *istate, const char **pathspec)
>  {
>       int i;
> diff --git a/resolve-undo.h b/resolve-undo.h
> index 8458769..7a30206 100644
> --- a/resolve-undo.h
> +++ b/resolve-undo.h
> @@ -12,5 +12,6 @@ extern struct string_list *resolve_undo_read(const char *, 
> unsigned long);
>  extern void resolve_undo_clear_index(struct index_state *);
>  extern int unmerge_index_entry_at(struct index_state *, int);
>  extern void unmerge_index(struct index_state *, const char **);
> +extern void unmerge_marked_index(struct index_state *);
>  
>  #endif
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to