Re: Lost Repository Was: Removing git-annex Repo

2011-12-05 Thread Klaus Ethgen

On Mon, 5 Dec 2011 at 17:19, Joey Hess wrote:
 Klaus Ethgen wrote:
   Can you check out the git-annex branch and run git-log on uuid.log, 
   and see what the most recent change to it looked like?
  
  It is an update. After that I revert this update, and the next time it
  will be purged again.
 
 It sort of sounds as if you are checking out the git-annex branch,
 manually editing and committing a file, and seeing git-annex revert
 that change.

Well, after the bug had happened several times, I started to do the following
each time git-annex removed the entries:
   git checkout git-annex
   git cherry-pick id # This id is the first time I fixed it by hand
   git add trust.log uuid.log
   git commit -m "Correcting update"
   git checkout master

The patch looks something like:
   diff --git a/trust.log b/trust.log
   index xxx..xxx 100644
   --- a/trust.log
   +++ b/trust.log
   @@ -1 +1,2 @@
    ---- 1 timestamp=1322867090.765867s
   +---- 0 timestamp=1322867165.394761s
   diff --git a/uuid.log b/uuid.log
   index xxx..xxx 100644
   --- a/uuid.log
   +++ b/uuid.log
   @@ -1,2 +1,3 @@
   ----- Backup timestamp=1322866827.929813s
    ---- Master timestamp=1322866770.445515s
   +---- Backup timestamp=1322866827.929813s
   +---- Clone timestamp=1322867722.827595s

This reverts the change to these two files.

 That's expected behavior actually..
 http://git-annex.branchable.com/internals/ explains why. 

Sorry, no, it doesn't. I do not want to modify the git-annex branch, as I
know it is internal. But the situation gives me no other choice than to
revert it each and every time.
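
For reference, each line in the trust.log and uuid.log diff above ends in a timestamp= field. Below is a minimal sketch of how such lines could be reduced to one current value per identifier. It assumes, and nothing in this thread confirms it, that when an identifier appears more than once the line with the newest timestamp wins. The identifiers `repo-a` and `repo-b` and both function names are hypothetical; this is not git-annex code.

```python
def parse_log_line(line):
    """Split '<id> <value> timestamp=<seconds>s' into (id, value, seconds)."""
    head, _, stamp = line.strip().rpartition(" timestamp=")
    ident, _, value = head.partition(" ")
    return ident, value, float(stamp.rstrip("s"))


def newest_values(lines):
    """Return {id: value}, keeping the entry with the newest timestamp.

    Assumption: newest timestamp wins. This rule is illustrative only.
    """
    latest = {}
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        ident, value, seconds = parse_log_line(line)
        if ident not in latest or seconds > latest[ident][1]:
            latest[ident] = (value, seconds)
    return {ident: value for ident, (value, seconds) in latest.items()}


# Hypothetical identifiers; timestamps borrowed from the diff above.
trust_log = [
    "repo-a 1 timestamp=1322867090.765867s",
    "repo-a 0 timestamp=1322867165.394761s",
    "repo-b 1 timestamp=1322866770.445515s",
]
print(newest_values(trust_log))
```

Under that assumption, the order of the lines in the file does not matter; only the timestamps decide which entry is current.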

Regards
   Klaus
-- 
Klaus Ethgen  http://www.ethgen.ch/
pub  4096R/4E20AF1C 2011-05-16   Klaus Ethgen kl...@ethgen.de
Fingerprint: 85D4 CA42 952C 949B 1753  62B3 79D0 B06F 4E20 AF1C
___
vcs-home mailing list
vcs-home@lists.madduck.net
http://lists.madduck.net/listinfo/vcs-home


Re: New integration branch

2011-12-05 Thread Joey Hess
Adam Spiers wrote:
  9c87f2352214175de307efedb8fd93889a26afbc
         Can you give an example of when this is needed?
 
 I can't remember but I definitely saw it happen at least once :-/

My worry is that, since that really shouldn't happen AFAICS, you
were actually seeing a bug. Either that or it's a corner case I have not
identified.

  602f26714254f3c65389b7665d15d1d5d0e227a9
         mr is quite typically (I know, not by you) run
         inside the repository. Which would leave the user
         in an apparently empty directory after mr update if
         an mr update deleted and remade the whole repository.
 
         I don't like that; I don't think things in mr should be
         deleting repositories in general; mr doesn't even delete
         a repo that has deleted = true, it only warns the user about it.
 
 Hmm, that's a fair point, although the only alternative is to change
 the contents of the directory rather than the directory itself -
 similarly to how `git checkout' does, for instance.  I'll see if I can
 get around to doing that.  Perhaps some of the effort could be reused
 for implementing download_diff (diff against the archive file).

I think you could just use rsync :)
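
A rough sketch of the "change the contents, not the directory" idea, written as a pure-Python stand-in rather than as mr or rsync code. The function names are hypothetical. With rsync itself, the same effect would typically come from something like `rsync -a --delete src/ dst/`. The point is that the destination directory is never removed, so a shell sitting inside it keeps a valid working directory.

```python
import os
import shutil


def sync_contents(src, dst):
    """Make the contents of dst match src without removing dst itself.

    Symlinks in src are followed, broken symlinks in src are not
    handled, and directory permissions are not synchronised.
    Illustrative only.
    """
    os.makedirs(dst, exist_ok=True)
    wanted = set(os.listdir(src))

    # Remove entries that no longer exist in the source.
    for name in os.listdir(dst):
        if name not in wanted:
            _remove(os.path.join(dst, name))

    # Copy files and recurse into directories that the source has.
    for name in sorted(wanted):
        s = os.path.join(src, name)
        d = os.path.join(dst, name)
        if os.path.isdir(s):
            # Clear away a file or symlink that is in the way.
            if os.path.lexists(d) and (os.path.islink(d) or not os.path.isdir(d)):
                _remove(d)
            sync_contents(s, d)
        else:
            # Clear away a directory or symlink that is in the way.
            if os.path.lexists(d) and (os.path.islink(d) or os.path.isdir(d)):
                _remove(d)
            shutil.copy2(s, d)


def _remove(path):
    """Delete a file, a symlink, or a whole directory tree."""
    if os.path.isdir(path) and not os.path.islink(path):
        shutil.rmtree(path)
    else:
        os.remove(path)
```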

  cf3388f443b9d7afe6dc7d8a2159b45fb01ab4e4
         This is a slow way to make machine-parsable info available --
         the similar mr list takes 8 seconds here, since it has to run
         169 shells. That's ok when you're just running mr, but I would
         not like to use a command that depended on that information.
 
 Sure, that's why I used a simple on-disk cache:
 
   
 https://github.com/aspiers/kitenet-mr/commit/b60acb2e767b91ca6d406198d7eea1b3f73ad2bf
 
 It works fine.  I could get more sophisticated and allow per-user
 configuration of the cache invalidation strategy, e.g. so that it
 would automatically rebuild the cache when ~/.mrconfig et al. are
 changed, but manual rebuilds aren't a great hardship.  In fact I could
 even rebuild the cache every time mr runs!
 
         If a machine-parseable list of repositories is needed,
         I think it'd be better to have a perl function that emits
         it in one go.
 
 I don't see how that's possible without ignoring the `skip',
 `deleted', and `include' parameters.

The include parameter is not a big problem, it's unlikely to require
more than one shell process, which will be relatively fast.

It's not clear to me what should be done about skip and deleted.
skip in particular can behave in weird ways, when something like
hours_since is used. To handle that all the skips would need to be
tested, which would be less work than mr list but still verging on
expensive. 

Depending on the application, it might be better to just dump all the
defined repositories including skipped and deleted ones; if the consumer
then runs mr in a skipped/deleted repo, mr will do something sane after
all.
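
A minimal sketch of the "dump everything in one go" option, in Python rather than Perl and not taken from mr. It treats the config as plain INI text, runs no shell at all, and reports skip and deleted exactly as written. The sample config, its URLs, and the name `dump_repos` are hypothetical, and real mr configs may use syntax this simple parser does not handle.

```python
import configparser

# Hypothetical sample; not a real configuration.
SAMPLE_MRCONFIG = """\
[src/mr]
checkout = git clone git://example.invalid/mr.git mr

[src/old]
checkout = git clone git://example.invalid/old.git old
deleted = true

[src/big]
checkout = git clone git://example.invalid/big.git big
skip = true
"""


def dump_repos(config_text):
    """List every defined section with its raw skip/deleted values.

    Nothing is executed: skip and deleted are reported as written,
    so the consumer decides what to do with them.
    """
    parser = configparser.RawConfigParser(delimiters=("=",))
    parser.read_string(config_text)
    return [
        (name, parser[name].get("skip"), parser[name].get("deleted"))
        for name in parser.sections()
    ]


for name, skip, deleted in dump_repos(SAMPLE_MRCONFIG):
    print(name, "skip=%s" % skip, "deleted=%s" % deleted)
```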

-- 
see shy jo



Re: alternative mechanisms for including config

2011-12-05 Thread Joey Hess
Adam Spiers wrote:
 This may be a good time to discuss the design of the `include'
 parameter.  When you were deciding what its value should be, I guess
 there were at least three possibilities:
 
 (1) a chunk of shell-code which returns the actual shell-code to
 include
 
 (2) a chunk of shell-code which returns a list of names of files
 to include
 
 (3) a delimited list of files to include
 
 You went with (1).  One advantage of this is the ability to
 dynamically generate code to include.  But this could also be achieved
 with (2), by generating the files to include and then returning the
 names of the generated files.  Also, with (1), if the shell-code has
 an issue it's harder to debug because there's no containing file (and
 line number and surrounding lines) to refer to.  The main advantage of
 (3) is that you don't have to execute any shell code at all.  This
 would facilitate implementation of your suggestion of writing a Perl
 function to emit the repo list, although there's still the problem of
 the `skip' parameter, and I suspect too many users are already relying
 on the dynamic nature of `include' for (3) to be feasible.
 
 But might it be worth implementing (2) alongside the existing (1), via
 a new `includefiles' special parameter?

I've made mr show the included line content in error messages now.
The speed hit of running that one shell command is minor.
It doesn't seem worth bothering users by deprecating the current
include, and a separate mechanism that takes a list of files would be
needless complication.

-- 
see shy jo



Re: New integration branch

2011-12-05 Thread Adam Spiers
On Mon, Dec 5, 2011 at 5:38 PM, Joey Hess j...@kitenet.net wrote:
 Adam Spiers wrote:
  9c87f2352214175de307efedb8fd93889a26afbc
         Can you give an example of when this is needed?

 I can't remember but I definitely saw it happen at least once :-/

 My worry is that, since that really shouldn't happen AFAICS, you
 were actually seeing a bug. Either that or it's a corner case I have not
 identified.

It's on my TODO list to try to reproduce; I'll let you know if I
manage to.

  602f26714254f3c65389b7665d15d1d5d0e227a9
         mr is quite typically (I know, not by you) run
         inside the repository. Which would leave the user
         in an apparently empty directory after mr update if
         an mr update deleted and remade the whole repository.
 
         I don't like that; I don't think things in mr should be
         deleting repositories in general; mr doesn't even delete
         a repo that has deleted = true, it only warns the user about it.

 Hmm, that's a fair point, although the only alternative is to change
 the contents of the directory rather than the directory itself -
 similarly to how `git checkout' does, for instance.  I'll see if I can
 get around to doing that.  Perhaps some of the effort could be reused
 for implementing download_diff (diff against the archive file).

 I think you could just use rsync :)

Yeah, that sounds worth trying.

         If a machine-parseable list of repositories is needed,
         I think it'd be better to have a perl function that emits
         it in one go.

 I don't see how that's possible without ignoring the `skip',
 `deleted', and `include' parameters.

 The include parameter is not a big problem, it's unlikely to require
 more than one shell process, which will be relatively fast.

 It's not clear to me what should be done about skip and deleted.
 skip in particular can behave in weird ways, when something like
 hours_since is used. To handle that all the skips would need to be
 tested, which would be less work than mr list but still verging on
 expensive.

 Depending on the application, it might be better to just dump all the
 defined repositories including skipped and deleted ones; if the consumer
 then runs mr in a skipped/deleted repo, mr will do something sane after
 all.

Skipper functions like hours_since could (and probably should) decide
not to skip if MR_ACTION is set to a read-only action such as list -
arguably diff and status too, although that's a matter of personal
taste.
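
To make that concrete, here is a small sketch of a time-based skipper that never skips for read-only actions. It is written in Python purely for illustration; real skippers are shell snippets, and the names `READ_ONLY_ACTIONS` and `should_skip` and their parameters are hypothetical.

```python
READ_ONLY_ACTIONS = {"list"}  # diff and status could be added, per taste


def should_skip(action, hours_elapsed, min_hours):
    """Decide whether a time-based skipper should skip this repository.

    action        -- the action being run (what MR_ACTION would hold)
    hours_elapsed -- hours since the action last ran here
    min_hours     -- minimum interval wanted between runs
    """
    if action in READ_ONLY_ACTIONS:
        return False  # never hide a repo from read-only actions
    return hours_elapsed < min_hours


print(should_skip("update", 3, 12))  # recently updated, so skip
print(should_skip("list", 3, 12))    # read-only, so do not skip
```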

But maybe we should step back a bit.  Currently we know that a full
mr list is not particularly fast, but has anyone actually profiled
it to find out where most of the time is being spent?  If we're only
guessing then we might have it completely wrong ...


Re: New integration branch

2011-12-05 Thread Joey Hess
Joey Hess wrote:
 It could well not be. mr list -j 10 runs in the same time as mr list -j 1,
 suggesting the overhead is in something other than actually running the
 shell.

Whoops, bad benchmark, -j comes before action.

Anyway, yes, without any calls to system(), mr list takes just 0.35 seconds.
Those calls are:

169 mr list: running set -e;  # actual list command
118 mr skip: running vcs test  # 
 55 mr list: running skip test set -e;
 50 mr deleted: running vcs test 

(Note that the vcs test is split between skip and deleted, but
if those features are removed, the actual list command would
trigger the same test, so those don't add overhead.)

Moving the git_test etc into perl code would be one way to speed it up
for the common case. Adding a special case optimisation to avoid the shell
for true and false brings mr list down from 8.50 to 1.81 seconds.
The remaining time here is spent running skip tests, of which I have a lot. Probably
looking at sub-1-second times for most people.
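
The special case can be sketched as follows. This is a Python illustration of the idea, not the actual Perl change in mr: tests that are literally true or false are answered in-process, and only everything else spawns a shell. The names `run_test` and `shell_calls` are hypothetical.

```python
import subprocess

shell_calls = 0  # counts how many times a shell was actually spawned


def run_test(command):
    """Return True if the test command succeeds.

    Literal "true" and "false" are answered in-process; anything
    else is handed to sh -c, as a stand-in for mr's shell calls.
    """
    global shell_calls
    stripped = command.strip()
    if stripped == "true":
        return True
    if stripped == "false":
        return False
    shell_calls += 1
    return subprocess.run(["sh", "-c", command]).returncode == 0


tests = ["true", "false", " true ", "[ 1 = 1 ]", "exit 3"]
results = [run_test(t) for t in tests]
print(results, "shells spawned:", shell_calls)
```

Only the last two tests in the list reach the shell; the first three cost no process at all.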

-- 
see shy jo



Re: New integration branch

2011-12-05 Thread Joey Hess
Joey Hess wrote:
 Moving the git_test etc into perl code would be one way to speed it up
 for the common case. Adding a special case optimisation to avoid the shell
 for true and false brings mr list down from 8.50 to 1.81 seconds.
 The remaining time here is spent running skip tests, of which I have a lot. Probably
 looking at sub-1-second times for most people.

These optimisations are now in place.

joey@gnu:~/src/d-i>time mr -q list
1.14user 2.17system 0:05.12elapsed 64%CPU (0avgtext+0avgdata 26368maxresident)k
0inputs+0outputs (0major+269034minor)pagefaults 0swaps
joey@gnu:~/src/d-i>time ~/src/mr/mr -q list
0.38user 0.02system 0:00.44elapsed 91%CPU (0avgtext+0avgdata 26640maxresident)k
0inputs+0outputs (0major+6429minor)pagefaults 0swaps

joey@gnu:~>time mr -q list
1.67user 3.86system 0:08.75elapsed 63%CPU (0avgtext+0avgdata 26720maxresident)k
0inputs+0outputs (0major+464487minor)pagefaults 0swaps
joey@gnu:~>time ~/src/mr/mr -q list
0.56user 0.60system 0:01.78elapsed 65%CPU (0avgtext+0avgdata 26800maxresident)k
0inputs+0outputs (0major+84959minor)pagefaults 0swaps

-- 
see shy jo

