Re: New integration branch
Adam Spiers wrote:
> > 9c87f2352214175de307efedb8fd93889a26afbc
> > Can you give an example of when this is needed?
>
> I can't remember but I definitely saw it happen at least once :-/

My worry is that, since that really shouldn't happen AFAICS, you were
actually seeing a bug. Either that or it's a corner case I have not
identified.

> > 602f26714254f3c65389b7665d15d1d5d0e227a9
> > mr is quite typically (I know, not by you) run inside the repository.
> > Which would leave the user in an apparently empty directory after mr
> > update if an mr update deleted and remade the whole repository. I
> > don't like that; I don't think things in mr should be deleting
> > repositories in general; mr doesn't even delete a repo that has
> > deleted = true, it only warns the user about it.
>
> Hmm, that's a fair point, although the only alternative is to change
> the contents of the directory rather than the directory itself -
> similarly to how `git checkout' does, for instance. I'll see if I can
> get around to doing that. Perhaps some of the effort could be reused
> for implementing download_diff (diff against the archive file).

I think you could just use rsync :)

> > cf3388f443b9d7afe6dc7d8a2159b45fb01ab4e4
> > This is a slow way to make machine-parsable info available -- the
> > similar mr list takes 8 seconds here, since it has to run 169 shells.
> > That's ok when you're just running mr, but I would not like to use a
> > command that depended on that information.
>
> Sure, that's why I used a simple on-disk cache:
>
>   https://github.com/aspiers/kitenet-mr/commit/b60acb2e767b91ca6d406198d7eea1b3f73ad2bf
>
> It works fine. I could get more sophisticated and allow per-user
> configuration of the cache invalidation strategy, e.g. so that it
> would automatically rebuild the cache when ~/.mrconfig et al. are
> changed, but manual rebuilds aren't a great hardship. In fact I could
> even rebuild the cache every time mr runs!

> > If a machine-parseable list of repositories is needed, I think it'd
> > be better to have a perl function that emits it in one go.
> I don't see how that's possible without ignoring the `skip',
> `deleted', and `include' parameters.

The include parameter is not a big problem; it's unlikely to require
more than one shell process, which will be relatively fast.

It's not clear to me what should be done about skip and deleted. skip
in particular can behave in weird ways, when something like hours_since
is used. To handle that, all the skips would need to be tested, which
would be less work than mr list but still verging on expensive.

Depending on the application, it might be better to just dump all the
defined repositories, including skipped and deleted ones; if the
consumer then runs mr in a skipped/deleted repo, mr will do something
sane after all.

-- 
see shy jo

___
vcs-home mailing list
vcs-home@lists.madduck.net
http://lists.madduck.net/listinfo/vcs-home
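For context on why those skip tests are expensive to evaluate in bulk:
each one is a shell snippet in ~/.mrconfig, so testing it means spawning
a shell. A typical time-based skip, along the lines of the mrconfig
documentation (repository name and URL here are illustrative), looks
like:

```
[src/linux]
checkout = git clone git://example.com/linux.git
# Skip updates unless at least 12 hours have passed since the last one;
# mr passes the action name to the skip test as $1.
skip = [ "$1" = update ] && ! hours_since "$1" 12
```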
Re: New integration branch
On Mon, Dec 5, 2011 at 5:38 PM, Joey Hess j...@kitenet.net wrote:
> Adam Spiers wrote:
> > > 9c87f2352214175de307efedb8fd93889a26afbc
> > > Can you give an example of when this is needed?
> >
> > I can't remember but I definitely saw it happen at least once :-/
>
> My worry is that, since that really shouldn't happen AFAICS, you were
> actually seeing a bug. Either that or it's a corner case I have not
> identified.

It's on my TODO list to try to reproduce; I'll let you know if I manage
to.

> > > 602f26714254f3c65389b7665d15d1d5d0e227a9
> > > mr is quite typically (I know, not by you) run inside the
> > > repository. Which would leave the user in an apparently empty
> > > directory after mr update if an mr update deleted and remade the
> > > whole repository. I don't like that; I don't think things in mr
> > > should be deleting repositories in general; mr doesn't even delete
> > > a repo that has deleted = true, it only warns the user about it.
> >
> > Hmm, that's a fair point, although the only alternative is to change
> > the contents of the directory rather than the directory itself -
> > similarly to how `git checkout' does, for instance. I'll see if I
> > can get around to doing that. Perhaps some of the effort could be
> > reused for implementing download_diff (diff against the archive
> > file).
>
> I think you could just use rsync :)

Yeah, that sounds worth trying.

> > > If a machine-parseable list of repositories is needed, I think
> > > it'd be better to have a perl function that emits it in one go.
> >
> > I don't see how that's possible without ignoring the `skip',
> > `deleted', and `include' parameters.
>
> The include parameter is not a big problem; it's unlikely to require
> more than one shell process, which will be relatively fast.
>
> It's not clear to me what should be done about skip and deleted. skip
> in particular can behave in weird ways, when something like
> hours_since is used. To handle that, all the skips would need to be
> tested, which would be less work than mr list but still verging on
> expensive.
> Depending on the application, it might be better to just dump all the
> defined repositories, including skipped and deleted ones; if the
> consumer then runs mr in a skipped/deleted repo, mr will do something
> sane after all.

Skipper functions like hours_since could (and probably should) decide
not to skip if MR_ACTION is set to a read-only action such as list -
arguably diff and status too, although that's a matter of personal
taste.

But maybe we should step back a bit. Currently we know that a full mr
list is not particularly fast, but has anyone actually profiled it to
find out where most of the time is being spent? If we're only guessing
then we might have it completely wrong ...
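One way Adam's suggestion might look in practice - hedged, since
MR_ACTION is only a proposal (current mr instead passes the action name
to the skip test as "$1") - is an action-aware skip that never skips
read-only actions; the repository name and URL are illustrative:

```
[src/big-project]
checkout = git clone git://example.com/big-project.git
# Sketch only: never skip read-only actions; rate-limit updates.
skip =
	case "$1" in
		list|status|diff) false ;;
		update) ! hours_since "$1" 12 ;;
		*) false ;;
	esac
```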
Re: New integration branch
Joey Hess wrote:
> It could well not be. mr list -j 10 runs in the same time as mr list
> -j 1, suggesting the overhead is in something else than actually
> running the shell.

Whoops, bad benchmark, -j comes before the action. Anyway, yes, without
any calls to system(), mr list takes just 0.35 seconds. Those calls are:

    169 mr list: running set -e;          # actual list command
    118 mr skip: running vcs test
     55 mr list: running skip test set -e;
     50 mr deleted: running vcs test

(Note that the vcs test is split between skip and deleted, but if those
features are removed, the actual list command would trigger the same
test, so those don't add overhead.)

Moving the git_test etc into perl code would be one way to speed it up
for the common case. Adding a special case optimisation to avoid the
shell for true and false brings mr list down from 8.50 to 1.81 seconds.
The remaining time here is spent running skip tests - I have a lot.
Probably looking at sub-1-second times for most people.

-- 
see shy jo
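The true/false special case lives in mr's perl, but the idea is easy to
sketch in shell (the function name is illustrative): only spawn a shell
when the configured test is something other than the two literals.

```shell
# Sketch of the special-case optimisation discussed: skip spawning a
# shell when the configured test is literally "true" or "false", and
# fall back to a real shell for anything else.
run_test() {
    case "$1" in
        true)  return 0 ;;   # always-run repos: no shell needed
        false) return 1 ;;   # always-skip repos: no shell needed
        *)     sh -c "$1" ;; # arbitrary test: pay for a shell
    esac
}
```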
Re: New integration branch
Joey Hess wrote:
> Moving the git_test etc into perl code would be one way to speed it up
> for the common case. Adding a special case optimisation to avoid the
> shell for true and false brings mr list down from 8.50 to 1.81
> seconds. The remaining time here is spent running skip tests - I have
> a lot. Probably looking at sub-1-second times for most people.

These optimisations are now in place.

joey@gnu:~/src/d-i> time mr -q list
1.14user 2.17system 0:05.12elapsed 64%CPU (0avgtext+0avgdata 26368maxresident)k
0inputs+0outputs (0major+269034minor)pagefaults 0swaps

joey@gnu:~/src/d-i> time ~/src/mr/mr -q list
0.38user 0.02system 0:00.44elapsed 91%CPU (0avgtext+0avgdata 26640maxresident)k
0inputs+0outputs (0major+6429minor)pagefaults 0swaps

joey@gnu:~> time mr -q list
1.67user 3.86system 0:08.75elapsed 63%CPU (0avgtext+0avgdata 26720maxresident)k
0inputs+0outputs (0major+464487minor)pagefaults 0swaps

joey@gnu:~> time ~/src/mr/mr -q list
0.56user 0.60system 0:01.78elapsed 65%CPU (0avgtext+0avgdata 26800maxresident)k
0inputs+0outputs (0major+84959minor)pagefaults 0swaps

-- 
see shy jo
Re: New integration branch
On Sun, Dec 4, 2011 at 9:11 PM, Joey Hess j...@kitenet.net wrote:
> 37617a63ec993b128f77a945a2020ec894c58eb1
> loadconfig already uses %loaded to avoid reloading the same config
> twice, so this extra check is not necessary, I think.

Ah yes, I missed that. Still, for the cost of an extra line of code,
isn't it worth being explicit? If it confused me, then presumably it
could confuse other developers in the future.

> a61c1450ff4b108af26e899a89a1d8ff49cab86c
> I picked the bugfix part. The warning message on missing chain files
> exposes an unclear thing in mr; it will try to chain to directories
> even when their repository has skip = true, which causes the warning
> to show up unexpectedly (ie, here). I think it needs to be changed to
> honor skip = true even if chain = true.

Ah OK, I'll look into that.

> b3b68137988e61be1a0f7d90caf05eabf7850f44
> I developed a different fix this morning that shows correct line
> numbers for both the mrconfig and the position in the include, it's
> in my tree.

Yep, I saw that - very nice :)

> 135e0076c9a93cd0556b9b25ff355ad25546a78c
> This makes mr fetch do a git fetch, but nothing for the other DVCSes
> which can also do things like fetch, and no documentation of it

I can add documentation. However, although ISTR that `hg pull' is akin
to `git fetch', I don't know how it's done for the other decentralised
VCSes.

> 9c87f2352214175de307efedb8fd93889a26afbc
> Can you give an example of when this is needed?

I can't remember but I definitely saw it happen at least once :-/

> 602f26714254f3c65389b7665d15d1d5d0e227a9
> mr is quite typically (I know, not by you) run inside the repository.
> Which would leave the user in an apparently empty directory after mr
> update if an mr update deleted and remade the whole repository. I
> don't like that; I don't think things in mr should be deleting
> repositories in general; mr doesn't even delete a repo that has
> deleted = true, it only warns the user about it.
Hmm, that's a fair point, although the only alternative is to change the
contents of the directory rather than the directory itself - similarly
to how `git checkout' does, for instance. I'll see if I can get around
to doing that. Perhaps some of the effort could be reused for
implementing download_diff (diff against the archive file).

> 650620d7b6661f9cc59b4adfb6a7d945240fe5c7
> f16e51cea8595afc92f3ab9230e3c5a44baed904
> I've held off on these plugins since I think they depend on
> 602f26714254f3c65389b7665d15d1d5d0e227a9

No, only the download plugin. The stow plugin *never* writes to the
repository tree; it manages symlinks in an entirely different place
(typically $HOME). Having said that, I just remembered that the stow
plugin depends on MR_NAME, so that will have to wait too.

> cf3388f443b9d7afe6dc7d8a2159b45fb01ab4e4
> This is a slow way to make machine-parsable info available -- the
> similar mr list takes 8 seconds here, since it has to run 169 shells.
> That's ok when you're just running mr, but I would not like to use a
> command that depended on that information.

Sure, that's why I used a simple on-disk cache:

  https://github.com/aspiers/kitenet-mr/commit/b60acb2e767b91ca6d406198d7eea1b3f73ad2bf

It works fine. I could get more sophisticated and allow per-user
configuration of the cache invalidation strategy, e.g. so that it would
automatically rebuild the cache when ~/.mrconfig et al. are changed, but
manual rebuilds aren't a great hardship. In fact I could even rebuild
the cache every time mr runs!

> If a machine-parseable list of repositories is needed, I think it'd
> be better to have a perl function that emits it in one go.

I don't see how that's possible without ignoring the `skip', `deleted',
and `include' parameters.

> (Also, the patch references a MR_NAME that is not present in my mr
> tree.)

Yeah :-( I had to rebase about 20 times to separate out the patches you
are currently willing to consider, so mistakes were inevitable.
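The mtime-based cache invalidation described above could be sketched
like this (paths and function name are illustrative, not from the
linked commit):

```shell
# Hypothetical sketch: rebuild the cached repository list only when the
# config file is newer than the cached copy, then print the cache.
refresh_cache() {
    config=$1; cache=$2; shift 2
    # "$@" is the (slow) command that regenerates the list
    if [ ! -e "$cache" ] || [ "$config" -nt "$cache" ]; then
        "$@" > "$cache"
    fi
    cat "$cache"
}
```

Typical use would be something like
`refresh_cache ~/.mrconfig ~/.cache/mr-repos mr -q list`, so the full
scan only runs when ~/.mrconfig has changed.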
> 4cd2b59d0c66d71316dfc1d411a3e3da439643bc
> I'm not quite sure of the point of this refactoring,

Legibility and modularity. Longer functions make unfamiliar code harder
to understand, and often give variables an unnecessarily wide scope. If
I see this line of code within bootstrap():

    my $tmpconfig = download($url);

then (a) it's immediately obvious that $url contains config which is
being downloaded and stored in a temporary file, so there's no need for
the existing comment "Download the config file to a temporary location",
(b) it makes it easy _and optional_ for me to view how the download is
implemented, and (c) it means that the scope of @curlargs and
$curlstatus is clearly limited to the download, so I don't need to
visually grep for them in
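The shape of the extracted helper - in mr itself it would be perl, so
this shell rendering is illustrative only - might be:

```shell
# Hypothetical sketch of a download() helper extracted from a larger
# bootstrap routine: fetch a config file to a temporary location and
# print the temp file's path, keeping the curl details locally scoped.
download() {
    url=$1
    tmp=$(mktemp) || return 1
    if curl -s -f -L -o "$tmp" "$url"; then
        echo "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

The caller removes the temp file when done; on failure nothing is left
behind, and the curl arguments never leak into the calling function's
scope.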