Re: [PATCH/RFC v3 0/4] Improving performance of git clean

2015-04-22 Thread erik elfström
On Tue, Apr 21, 2015 at 11:24 PM, Jeff King p...@peff.net wrote:

 If I understand correctly, the reason that you need per-run setup is
 that your git clean command actually cleans things, and you need to
 restore the original state for each time-trial. Can you instead use git
 clean -n to do a dry-run? I think what you are timing is really the
 figure out what to clean step, and not the cleaning itself.

 -Peff


Yes, that is the problem. A dry run will spot this particular performance
issue but maybe we lose some value as a general performance test if
we only do half the clean? Admittedly we clearly lose some value in
the current state as well due to the copying taking more time than the
cleaning. I could go either way here.

/Erik
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC v3 0/4] Improving performance of git clean

2015-04-22 Thread Jeff King
On Wed, Apr 22, 2015 at 09:30:20PM +0200, erik elfström wrote:

 On Tue, Apr 21, 2015 at 11:24 PM, Jeff King p...@peff.net wrote:
 
  If I understand correctly, the reason that you need per-run setup is
  that your git clean command actually cleans things, and you need to
  restore the original state for each time-trial. Can you instead use git
  clean -n to do a dry-run? I think what you are timing is really the
  figure out what to clean step, and not the cleaning itself.
 
 Yes, that is the problem. A dry run will spot this particular performance
 issue but maybe we lose some value as a general performance test if
 we only do half the clean? Admittedly we clearly lose some value in
 the current state as well due to the copying taking more time than the
 cleaning. I could go either way here.

I guess it is a matter of opinion. I think testing only the find out
what to clean half separately is actually beneficial, because it helps
us isolate any slowdown. If we want to add a test for the other half, we
can, but I do not actually think it is currently that interesting (it is
just calling unlink() in a loop).

So even leaving the practical matters aside, I do not think it is a bad
thing to split it up. When you add in the fact that it is practically
much easier to test the first half, it seems to me that testing just
that is a good first step.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC v3 0/4] Improving performance of git clean

2015-04-22 Thread erik elfström
On Wed, Apr 22, 2015 at 9:46 PM, Jeff King p...@peff.net wrote:
 On Wed, Apr 22, 2015 at 09:30:20PM +0200, erik elfström wrote:

 Yes, that is the problem. A dry run will spot this particular performance
 issue but maybe we lose some value as a general performance test if
 we only do half the clean? Admittedly we clearly lose some value in
 the current state as well due to the copying taking more time than the
 cleaning. I could go either way here.

 I guess it is a matter of opinion. I think testing only the find out
 what to clean half separately is actually beneficial, because it helps
 us isolate any slowdown. If we want to add a test for the other half, we
 can, but I do not actually think it is currently that interesting (it is
 just calling unlink() in a loop).

 So even leaving the practical matters aside, I do not think it is a bad
 thing to split it up. When you add in the fact that it is practically
 much easier to test the first half, it seems to me that testing just
 that is a good first step.

 -Peff

Sounds reasonable to me. I'll make this change in v4, thanks!

(Sorry for the duplicate email Jeff, I'm bad at this mailing list thing...)

/Erik
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC v3 0/4] Improving performance of git clean

2015-04-21 Thread Junio C Hamano
erik elfström erik.elfst...@gmail.com writes:

 Ok, thanks for looking into this.

 I have no well founded opinions on the implementation but I do
 think the performance tests would be more meaningful if the
 setup/cleanup code could be removed from the timed section.
 If the community agrees on an implementation I would be happy
 to convert the new tests, either directly in this series or as a follow
 up if that is preferred.

Let's not delay the fix and do the perf thing as a follow-up series,
possibly an even independent one.

In other words, let's keep the topics small.


 /Erik

 On Tue, Apr 21, 2015 at 12:14 AM, Thomas Gummerer t.gumme...@gmail.com 
 wrote:
 On 04/18, Erik Elfström wrote:
 * Still have issues in the performance tests, see comments
   from Thomas Gummerer on v2

 I've looked at the modern style tests again, and I don't the code
 churn is worth it just for using them for the performance tests.  If
 anyone wants to take a look at the code, it's at
 github.com/tgummerer/git tg/perf-lib.

 I think adding the test_perf_setup_cleanup command would make more
 sense in this case.  If you want I can send a patch for that.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC v3 0/4] Improving performance of git clean

2015-04-21 Thread erik elfström
On Sun, Apr 19, 2015 at 3:14 AM, Junio C Hamano gits...@pobox.com wrote:
 Erik Elfström erik.elfst...@gmail.com writes:

 Known Problems:
 * Unsure about the setup.c:read_gitfile refactor, feels a bit
   messy?

 The interface indeed feels somewhat messy.  I suspect that a better
 interface might be more like setup_git_directory_gently() that is a
 gentler version of setup_git_directory().  The function takes an
 optional pointer to an int, and it behaves not-gently-at-all when
 the optional argument is NULL.  The location the optional pointer
 points at, when it is not NULL, can be used to return the error
 state to the caller.  That function pair only uses the optional
 argument to convey one bit of information (i.e. are we in any git
 repository or not?) back to the caller, but the interface could be
 used to tell the caller a lot more if we wanted to.

 If you model read_gitfile_gently() after that pattern, I would
 expect that

  - The extra pattern would be int *error;
  - The implementation of read_gitfile() would be

#define read_gitfile(path) read_gitfile_gently((path), NULL)

and the _gently() version will die when error parameter is set
to NULL and finds any error.

  - The caller of the gentle variant can use the error code to
determine what went wrong, not just the fact that it failed.  I
do not think your caller does not have an immediate need to tell
between invalid gitfile format and No path in gitfile, but
such an interface leaves that possibility open.

 Thanks.


Ok, I'll try that approach, thanks. (and apologies for the late reply)

/Erik
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC v3 0/4] Improving performance of git clean

2015-04-21 Thread erik elfström
Ok, thanks for looking into this.

I have no well founded opinions on the implementation but I do
think the performance tests would be more meaningful if the
setup/cleanup code could be removed from the timed section.
If the community agrees on an implementation I would be happy
to convert the new tests, either directly in this series or as a follow
up if that is preferred.

/Erik

On Tue, Apr 21, 2015 at 12:14 AM, Thomas Gummerer t.gumme...@gmail.com wrote:
 On 04/18, Erik Elfström wrote:
 * Still have issues in the performance tests, see comments
   from Thomas Gummerer on v2

 I've looked at the modern style tests again, and I don't the code
 churn is worth it just for using them for the performance tests.  If
 anyone wants to take a look at the code, it's at
 github.com/tgummerer/git tg/perf-lib.

 I think adding the test_perf_setup_cleanup command would make more
 sense in this case.  If you want I can send a patch for that.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC v3 0/4] Improving performance of git clean

2015-04-21 Thread Jeff King
On Tue, Apr 21, 2015 at 08:21:37PM +0200, erik elfström wrote:

 Ok, thanks for looking into this.
 
 I have no well founded opinions on the implementation but I do
 think the performance tests would be more meaningful if the
 setup/cleanup code could be removed from the timed section.
 If the community agrees on an implementation I would be happy
 to convert the new tests, either directly in this series or as a follow
 up if that is preferred.

If I understand correctly, the reason that you need per-run setup is
that your git clean command actually cleans things, and you need to
restore the original state for each time-trial. Can you instead use git
clean -n to do a dry-run? I think what you are timing is really the
figure out what to clean step, and not the cleaning itself.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC v3 0/4] Improving performance of git clean

2015-04-20 Thread Thomas Gummerer
On 04/18, Erik Elfström wrote:
 * Still have issues in the performance tests, see comments
   from Thomas Gummerer on v2

I've looked at the modern style tests again, and I don't the code
churn is worth it just for using them for the performance tests.  If
anyone wants to take a look at the code, it's at
github.com/tgummerer/git tg/perf-lib.

I think adding the test_perf_setup_cleanup command would make more
sense in this case.  If you want I can send a patch for that.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC v3 0/4] Improving performance of git clean

2015-04-18 Thread Junio C Hamano
Erik Elfström erik.elfst...@gmail.com writes:

 Known Problems:
 * Unsure about the setup.c:read_gitfile refactor, feels a bit
   messy?

The interface indeed feels somewhat messy.  I suspect that a better
interface might be more like setup_git_directory_gently() that is a
gentler version of setup_git_directory().  The function takes an
optional pointer to an int, and it behaves not-gently-at-all when
the optional argument is NULL.  The location the optional pointer
points at, when it is not NULL, can be used to return the error
state to the caller.  That function pair only uses the optional
argument to convey one bit of information (i.e. are we in any git
repository or not?) back to the caller, but the interface could be
used to tell the caller a lot more if we wanted to.

If you model read_gitfile_gently() after that pattern, I would
expect that

 - The extra pattern would be int *error;
 - The implementation of read_gitfile() would be

   #define read_gitfile(path) read_gitfile_gently((path), NULL)

   and the _gently() version will die when error parameter is set
   to NULL and finds any error.

 - The caller of the gentle variant can use the error code to
   determine what went wrong, not just the fact that it failed.  I
   do not think your caller does not have an immediate need to tell
   between invalid gitfile format and No path in gitfile, but
   such an interface leaves that possibility open.

Thanks.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html