Re: [PATCH/RFC v3 0/4] Improving performance of git clean
On Tue, Apr 21, 2015 at 11:24 PM, Jeff King p...@peff.net wrote: If I understand correctly, the reason that you need per-run setup is that your git clean command actually cleans things, and you need to restore the original state for each time-trial. Can you instead use git clean -n to do a dry-run? I think what you are timing is really the figure out what to clean step, and not the cleaning itself. -Peff Yes, that is the problem. A dry run will spot this particular performance issue but maybe we lose some value as a general performance test if we only do half the clean? Admittedly we clearly lose some value in the current state as well due to the copying taking more time than the cleaning. I could go either way here. /Erik -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC v3 0/4] Improving performance of git clean
On Wed, Apr 22, 2015 at 09:30:20PM +0200, erik elfström wrote: On Tue, Apr 21, 2015 at 11:24 PM, Jeff King p...@peff.net wrote: If I understand correctly, the reason that you need per-run setup is that your git clean command actually cleans things, and you need to restore the original state for each time-trial. Can you instead use git clean -n to do a dry-run? I think what you are timing is really the figure out what to clean step, and not the cleaning itself. Yes, that is the problem. A dry run will spot this particular performance issue but maybe we lose some value as a general performance test if we only do half the clean? Admittedly we clearly lose some value in the current state as well due to the copying taking more time than the cleaning. I could go either way here. I guess it is a matter of opinion. I think testing only the find out what to clean half separately is actually beneficial, because it helps us isolate any slowdown. If we want to add a test for the other half, we can, but I do not actually think it is currently that interesting (it is just calling unlink() in a loop). So even leaving the practical matters aside, I do not think it is a bad thing to split it up. When you add in the fact that it is practically much easier to test the first half, it seems to me that testing just that is a good first step. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC v3 0/4] Improving performance of git clean
On Wed, Apr 22, 2015 at 9:46 PM, Jeff King p...@peff.net wrote: On Wed, Apr 22, 2015 at 09:30:20PM +0200, erik elfström wrote: Yes, that is the problem. A dry run will spot this particular performance issue but maybe we lose some value as a general performance test if we only do half the clean? Admittedly we clearly lose some value in the current state as well due to the copying taking more time than the cleaning. I could go either way here. I guess it is a matter of opinion. I think testing only the find out what to clean half separately is actually beneficial, because it helps us isolate any slowdown. If we want to add a test for the other half, we can, but I do not actually think it is currently that interesting (it is just calling unlink() in a loop). So even leaving the practical matters aside, I do not think it is a bad thing to split it up. When you add in the fact that it is practically much easier to test the first half, it seems to me that testing just that is a good first step. -Peff Sounds reasonable to me. I'll make this change in v4, thanks! (Sorry for the duplicate email Jeff, I'm bad at this mailing list thing...) /Erik -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC v3 0/4] Improving performance of git clean
erik elfström erik.elfst...@gmail.com writes: Ok, thanks for looking into this. I have no well founded opinions on the implementation but I do think the performance tests would be more meaningful if the setup/cleanup code could be removed from the timed section. If the community agrees on an implementation I would be happy to convert the new tests, either directly in this series or as a follow up if that is preferred. Let's not delay the fix and do the perf thing as a follow-up series, possibly an even independent one. In other words, let's keep the topics small. /Erik On Tue, Apr 21, 2015 at 12:14 AM, Thomas Gummerer t.gumme...@gmail.com wrote: On 04/18, Erik Elfström wrote: * Still have issues in the performance tests, see comments from Thomas Gummerer on v2 I've looked at the modern style tests again, and I don't the code churn is worth it just for using them for the performance tests. If anyone wants to take a look at the code, it's at github.com/tgummerer/git tg/perf-lib. I think adding the test_perf_setup_cleanup command would make more sense in this case. If you want I can send a patch for that. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC v3 0/4] Improving performance of git clean
On Sun, Apr 19, 2015 at 3:14 AM, Junio C Hamano gits...@pobox.com wrote: Erik Elfström erik.elfst...@gmail.com writes: Known Problems: * Unsure about the setup.c:read_gitfile refactor, feels a bit messy? The interface indeed feels somewhat messy. I suspect that a better interface might be more like setup_git_directory_gently() that is a gentler version of setup_git_directory(). The function takes an optional pointer to an int, and it behaves not-gently-at-all when the optional argument is NULL. The location the optional pointer points at, when it is not NULL, can be used to return the error state to the caller. That function pair only uses the optional argument to convey one bit of information (i.e. are we in any git repository or not?) back to the caller, but the interface could be used to tell the caller a lot more if we wanted to. If you model read_gitfile_gently() after that pattern, I would expect that - The extra pattern would be int *error; - The implementation of read_gitfile() would be #define read_gitfile(path) read_gitfile_gently((path), NULL) and the _gently() version will die when error parameter is set to NULL and finds any error. - The caller of the gentle variant can use the error code to determine what went wrong, not just the fact that it failed. I do not think your caller does not have an immediate need to tell between invalid gitfile format and No path in gitfile, but such an interface leaves that possibility open. Thanks. Ok, I'll try that approach, thanks. (and apologies for the late reply) /Erik -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC v3 0/4] Improving performance of git clean
Ok, thanks for looking into this. I have no well founded opinions on the implementation but I do think the performance tests would be more meaningful if the setup/cleanup code could be removed from the timed section. If the community agrees on an implementation I would be happy to convert the new tests, either directly in this series or as a follow up if that is preferred. /Erik On Tue, Apr 21, 2015 at 12:14 AM, Thomas Gummerer t.gumme...@gmail.com wrote: On 04/18, Erik Elfström wrote: * Still have issues in the performance tests, see comments from Thomas Gummerer on v2 I've looked at the modern style tests again, and I don't the code churn is worth it just for using them for the performance tests. If anyone wants to take a look at the code, it's at github.com/tgummerer/git tg/perf-lib. I think adding the test_perf_setup_cleanup command would make more sense in this case. If you want I can send a patch for that. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC v3 0/4] Improving performance of git clean
On Tue, Apr 21, 2015 at 08:21:37PM +0200, erik elfström wrote: Ok, thanks for looking into this. I have no well founded opinions on the implementation but I do think the performance tests would be more meaningful if the setup/cleanup code could be removed from the timed section. If the community agrees on an implementation I would be happy to convert the new tests, either directly in this series or as a follow up if that is preferred. If I understand correctly, the reason that you need per-run setup is that your git clean command actually cleans things, and you need to restore the original state for each time-trial. Can you instead use git clean -n to do a dry-run? I think what you are timing is really the figure out what to clean step, and not the cleaning itself. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC v3 0/4] Improving performance of git clean
On 04/18, Erik Elfström wrote: * Still have issues in the performance tests, see comments from Thomas Gummerer on v2 I've looked at the modern style tests again, and I don't the code churn is worth it just for using them for the performance tests. If anyone wants to take a look at the code, it's at github.com/tgummerer/git tg/perf-lib. I think adding the test_perf_setup_cleanup command would make more sense in this case. If you want I can send a patch for that. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC v3 0/4] Improving performance of git clean
Erik Elfström erik.elfst...@gmail.com writes: Known Problems: * Unsure about the setup.c:read_gitfile refactor, feels a bit messy? The interface indeed feels somewhat messy. I suspect that a better interface might be more like setup_git_directory_gently() that is a gentler version of setup_git_directory(). The function takes an optional pointer to an int, and it behaves not-gently-at-all when the optional argument is NULL. The location the optional pointer points at, when it is not NULL, can be used to return the error state to the caller. That function pair only uses the optional argument to convey one bit of information (i.e. are we in any git repository or not?) back to the caller, but the interface could be used to tell the caller a lot more if we wanted to. If you model read_gitfile_gently() after that pattern, I would expect that - The extra pattern would be int *error; - The implementation of read_gitfile() would be #define read_gitfile(path) read_gitfile_gently((path), NULL) and the _gently() version will die when error parameter is set to NULL and finds any error. - The caller of the gentle variant can use the error code to determine what went wrong, not just the fact that it failed. I do not think your caller does not have an immediate need to tell between invalid gitfile format and No path in gitfile, but such an interface leaves that possibility open. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html