etckeeper - keeping /etc in git

2007-11-13 Thread Joey Hess
I've written an etckeeper tool to help with storing /etc in git.
http://kitenet.net/~joey/code/etckeeper/
More details here:
http://kitenet.net/~joey/blog/entry/announcing_etckeeper/

-- 
see shy jo


signature.asc
Description: Digital signature
___
vcs-home mailing list
vcs-home@lists.madduck.net
http://lists.madduck.net/listinfo/vcs-home

Re: Git

2007-11-25 Thread Joey Hess
chombee wrote:
> So is anybody successfully versioning their homedir with git? And do
> they have any advice?

I've been using git for the lion's share of my home directory for a
while now.

> Some people on the git irc channel are very against this idea. They
> think you're crazy, will outright tell you not to do it, and even
> threaten to stop speaking to you if you continue doing it.
> Their reasoning seems to be that you're not using git for what it was
> intended for

Sadly, it's best to ignore this attitude, even if they are useful
contributors to git in general. Although if you really want to rile them
up, keep /etc in git too. :-)

> * Don't make the git repository directly in your homedir. You don't
> want git operating on 'live' config files, as this isn't what git was
> designed for, and who knows what might happen. If git dumps some merge
> conflicts into a config file while the file is in use, you could break
> your system.

This could happen with svn too. The worst case I can think of would be
if .profile has merge markers that the shell thought looked kinda like
redirection. I've never seen the merge markers truly break something
like that in several years of using svn for my home. The worst I see is
a web browser or music player failing to start because it can't parse
its config.

Anyway, there are plenty of places that merge markers could be inserted
into files in a linux kernel tree, or other thing that git is
semi-intended to be used for, with similar potential for bad results if
you type 'make' at the wrong point.

I'd draw the line at (non-fast-forward?) merging into /etc. I like my
passwd file. Anything else, I can recover from..

Of course if you feel safer copying stuff around, that's fine too, I
think it would be too annoying for me though.

> Also if you do this, then you have to abuse the gitignore file a lot to
> get git to ignore everything that you're not versioning.

That isn't abuse, it's a useful reminder of new dotfiles that you've
forgotten to check in.

> * Git doesn't track file metadata. If you care about this, I think you
> can script around it. Git supports executing custom scripts as
> pre-commit and post-checkout hooks. I think you could have a
> pre-commit script that reads file metadata and stores it somewhere
> inside the git repository, either in the content of the files
> themselves, or in a central metadata file. A post-checkout script
> could then read this stored data and reset it. Once they're setup you
> don't need to think about these scripts, just use git normally.

There's a nice package called metastore that does this, adding a
.metadata file that can be checked in to git. I use it for /etc, but I
don't currently use it for my home directory since I like the safety of
having a script I can run to fix up all permissions to files that
shouldn't be world readable, and since my home directory needs to be
checked out on many systems where I don't have root and can't trivially
install it. (It's annoying enough installing git on some of those.)
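
A permissions fix-up script of the kind mentioned above could look roughly
like this. It is only a sketch: the function name and the example path list
are assumptions for illustration, not the script actually in use.

```shell
#!/bin/sh
# Hypothetical helper: strip group/other access from paths that should
# never be world readable. Paths are given relative to the home directory.
fix_private_perms() {
    home=$1
    shift
    for rel in "$@"; do
        path=$home/$rel
        # Skip entries that are not checked out on this machine.
        [ -e "$path" ] || continue
        chmod -R go-rwx "$path"
    done
}

# Example (path names are assumptions):
#   fix_private_perms "$HOME" .ssh .gnupg .hide
```

Because it needs neither root nor extra packages, a script like this can be
run after every checkout on machines where metastore is not installed.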


I'd recommend taking some care in how you split the different parts of
your home directory into different git repositories. One big repo is not
a good idea. I have .. 89 repos checked out total, although the core of
my home directory is really only 9:

[EMAIL PROTECTED]:~>mr -n 1 ls
mr list: /home/joey/        # just enough to bootstrap the rest

mr list: /home/joey/.cron   # crontabs

mr list: /home/joey/.etc    # dotfiles I need everywhere

mr list: /home/joey/.hide   # stuff for trusted machines only

mr list: /home/joey/.plus   # dotfiles/dirs that use lots of space,
                            # only for machines with reasonable disk

mr list: /home/joey/doc     # private notes, etc

mr list: /home/joey/lib     # various cruft..

mr list: /home/joey/mail    # 1 gb of mail archives, going all the
                            # way back to 1995.

mr list: /home/joey/src     # misc source code and a .mrconfig
                            # to check out 80-odd other source repos

mr list: finished (9 ok; 3 skipped)

-- 
see shy jo



Re: Git

2007-11-26 Thread Joey Hess
chombee wrote:
> Do you use the same setup as me, with a single centralised bare repo that 
> every other repo pushes to and pulls from?

Each repo has a bare repo somewhere that it pushes to, yes.

> It looks as if you have a number of git repositories inside your first git 
> repository. Do you just .gitignore them?

Yes.

> Also do you have something to tie together working with so many different 
> repos? It must be quite inconvenient to do status, add, rm, commit etc. on 
> 9 different repos all the time.

Yes, that's mr http://kitenet.net/~joey/code/mr/

> Any tips for versioning email with git? Does git work well for email?
> I thought it might perform badly on the two extremes that email
> presents: one big file (mbox) or one big folder with lots of small
> files in it (maildir). I guess you would always want to do 'git add .
> ; git commit -a;' to avoid having to manually add, rm and commit
> individual files. I think you might want to somehow auto-generate
> commit messages for email too.

It works ok for mailbox format archives, I have a cron job that adds and
commits the mail archives. Of course, there's the same problem as using
svn that the disk usage is approximately doubled.

-- 
see shy jo



Re: Keeping your dotfiles in git

2008-08-26 Thread Joey Hess
[EMAIL PROTECTED] wrote:
> Second, to create all the symlinks you only
> need a simple command not a script: `ln -s ~/dotfiles/* ~/`.

That won't deal with dotfiles that are renamed or deleted.
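
To illustrate what handling renames and deletions involves, here is a
minimal sketch. The function name and directory layout are assumptions. It
links every entry of a dotfiles checkout into a target directory and then
prunes symlinks that were left dangling. The checkout path should be
absolute so the links resolve from anywhere.

```shell
#!/bin/sh
# Sketch: symlink each entry of SRC into DEST, then remove dangling
# symlinks in DEST that still point into SRC (renamed or deleted files).
link_dotfiles() {
    src=$1
    dest=$2
    # Cover visible and hidden entries while skipping "." and "..".
    for f in "$src"/* "$src"/.[!.]* "$src"/..?*; do
        [ -e "$f" ] || continue
        name=${f##*/}
        [ "$name" = .git ] && continue
        # Never clobber a real file or directory that is already there.
        if [ -e "$dest/$name" ] && [ ! -L "$dest/$name" ]; then
            continue
        fi
        ln -sfn "$f" "$dest/$name"
    done
    for l in "$dest"/* "$dest"/.[!.]* "$dest"/..?*; do
        [ -L "$l" ] || continue
        [ -e "$l" ] && continue
        # Dangling link: delete it only if it pointed into the checkout.
        case $(readlink "$l") in
            "$src"/*) rm -f "$l" ;;
        esac
    done
}

# Example: link_dotfiles "$HOME/dotfiles" "$HOME"
```

Note that a plain `~/dotfiles/*` glob does not match names starting with a
dot, which is why the loop lists the hidden patterns explicitly.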
 
> I'm not entirely clear on why, in the examples I've seen, the -s option is
> used to create symbolic links instead of just using hard links.

I've had better luck with symlinks (various things can accidentally
break a hard link, in a non-obvious way), but do have to use hardlinks
for a few things. IIRC fetchmail refuses to use a symlink for example.

> Also, I fail to see the point in having a dotfiles directory. It might
> make it easy to break things up into generic dotfiles that apply to all
> machines that you use and go in the dotfiles directory, and dotfiles that
> are only used on particular machines and go in various
> dotfiles-machinename directories. But aside from that, why not just create
> the git repository directly in your homedir? Give it a .gitignore file
> that ignores everything (contains one line with just a *), then add to the
> repo only the files you want to version with commands like `git add -f
> .muttrc`.

I have several dotfiles that I don't want to share to every machine with
a checkout of my home directory, so need multiple repositories.
 
-- 
see shy jo



Re: One Big Repo

2009-02-27 Thread Joey Hess
tchomby wrote:
> *   You are less likely to lose files. With many small repos, it becomes
> almost as easy to lose an entire repo as it was to lose a file before you
> started versioning your homedir.

I have worried about this too. If you're making new small repos on a
daily basis, then it would be easy to forget to push one out of your
laptop, and lose it in one of the disasters laptops seem to make so
common.

Also, old repos that are no longer used, and that you even stop
checking out, become one server failure and backup oops away from being
lost forever.

> *   With one big repo git log gives you a global history of all your files, a
> sort of log of what you've been doing on a day-to-day basis. This can be
> really handy. For example I have to meet with my supervisors every few
> weeks. Instead of using my memory I can just use git log to help me
> construct a progress report.

Yeah, I sometimes wish I could make mr construct an interleaved log of
all the repos it runs on.

> All in all I don't understand why many small repos is the recommended
> approach, sounds like making something simple into something complex. What
> disadvantages does one big repo have?

I think that most of the disadvantages of using one big repo can be
ignored until you have to share (part of) that repo with others.
Note that wanting to check things out onto multiple machines
eventually will tend toward the same set of problems that sharing
the repo with others will present.

So, some of the specific problems include:

* Participating in typical free software development, which really
  demands one repo per project. Or working for an employer, who probably
  doesn't want their files in your personal repo.
* Needing to keep some set of files private (not letting others see
  them), and some other set *very* private (only on one or two machines).
* Wanting to check large data files into a repo, but not having space
  to put that repo on some machines.
* Having automated commits to some files (of archived mail, for example),
  and not wanting to see that in your general history, or deal with
  the merging/up-to-dateness issues it can entail.
* Wanting to host some files on one server (perhaps one that is
  well-connected to the world), and others on another (perhaps one
  at home, or at work).

I use a mixed approach:

* I have separate repos for files of well-defined types, like mail,
  sound files, personal docs, personal programs, and my web site.
  Basically, one for each top-level directory of my home directory.
* I have separate repos for each free software (or work) project I am
  involved with, and if I start a new project, I start a new repo for it.
  For me, this means only a few new repos each year, hopefully.
* I have a (over?)complicated set of several repos for my dotfiles, so
  that I can have one repo with a minimal set that doesn't take much
  space, another that adds in the larger stuff, and another that adds
  private dotfiles.

Occasionally, something will start out in one place and have to move to
another (e.g., mr started out in my personal programs and moved to a
standalone package). But most of the time, there's one obvious place to
put any given file, with an existing repo that replicates it in a way
that's appropriate for that type of file.

-- 
see shy jo



Re: how to setup fresh machine from repository ala joey's home.git

2009-06-11 Thread Joey Hess
Jan Ptacek wrote:
> but I've run into a problem, cos I don't know how to git clone (mine)
> remote home.git
> into my home, cos git gives me
> fatal: destination directory already exists.
> (and I do not have root access on all machines)

Since git-clone is picky about files already existing in the directory
it clones into, I always do:

git clone git://git.kitenet.net/joey/home 
mv home/* home/.* .
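
The `home/.*` glob above also matches `home/.` and `home/..`, so mv
complains about those two entries even though everything else is moved. A
sketch that skips them (the function name is hypothetical) might be:

```shell
#!/bin/sh
# Sketch: move everything out of a fresh clone, including dotfiles and
# the .git directory, into an existing directory, then remove the
# emptied clone directory.
move_clone_into() {
    src=$1
    dest=$2
    for f in "$src"/* "$src"/.[!.]* "$src"/..?*; do
        # Unmatched glob patterns stay literal; skip them.
        [ -e "$f" ] || [ -L "$f" ] || continue
        mv -- "$f" "$dest"/
    done
    rmdir "$src"
}

# Example:
#   git clone git://git.kitenet.net/joey/home
#   move_clone_into home .
```

As with the plain mv, this still fails for any directory that already
exists and is non-empty in the destination, so those need to be merged by
hand.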

> and what is more perplexing to me:
> Joey's own .mrconfig specifies following checkout cmd for his home:
> []
> order = 1
> checkout =
>   git clone ssh://j...@git.kitenet.net/srv/git/joey/home joey
> Is he really checking out his home into ~joey/joey ?

Well, that rule is really only for reference, I don't think I've ever
used it since I have to clone that to get the .mrconfig that contains
it.

-- 
see shy jo



Re: One Big Repo

2009-07-10 Thread Joey Hess
chombee wrote:
> On Fri, Feb 27, 2009 at 01:55:16PM -0500, Joey Hess wrote:
> > So, some of the specific problems include:
> > 
> > 
> > * Wanting to check large data files into a repo, but not having space
> >   to put that repo on some machines.
> 
> I think a good idea might be to have a special repo for big files 
> only. So you would have two general catch-all repos, one for really 
> big files and one just for small files. Right now I put every file 
> that doesn't belong somewhere else into one catch-all repo, whether 
> the file is big or small. But there's no reason why I shouldn't be 
> able to check out some text files and documents because I committed a 
> big bunch of PNG images.

I set this up myself recently. I have a git repo that I commit things
like every photo I suck off my camera, scans, and videos to. I think
of it as my raw data repository.

The bare repo is on my file server; my laptop clones it as follows:

git clone --shared /media/server/path/raw.git

This way the laptop does not have the overhead of the full .git repository,
it can just access that from the file server (nfs or sshfs). But
I can still commit locally and push to the server later.

If I commit a lot of big stuff and my local .git repo gets too big, this
dangerous command tries to ensure it's all been pushed to the file
server, and then cleans it out locally:

zap () {
    if [ -e .git/objects/info/alternates ]; then
        # Only delete the local loose objects if the push succeeded.
        git push && rm -vf .git/objects/??/*
    else
        echo "not a --shared repo!"
    fi
}

Only remaining problem is that checking really enormous files, such
as videos I am working on, into git makes git allocate memory for the whole
file. Needing to set up swap just to git commit a 700 mb dv file on my netbook
is a trifle annoying. :-P

I also use branches a lot in this repo, so that my netbook only keeps the
currently used files checked out. I figure that when this repo gets too big,
I'll just archive it off elsewhere, and start a new one.

> > * Having automated commits to some files (of archived mail, for example),
> >   and not wanting to see that in your general history, or deal with
> >   the merging/up-to-dateness issues it can entail.
> 
> Has anyone got this working (automated commit of archived mail)? 
> Currently I use offlineimap run by cron to sync my mail to a local 
> directory, then another cron job uses rsync to backup this directory, 
> just in case something goes wrong with the live copy. It'd be cool to 
> backup the mail directory by committing to a git repo.

Sure, I use the attached trimail script, which in turn uses archivemail
to move the read mail from the offlineimap maildirs into archival mailboxes,
and is run from cron nightly.

-- 
see shy jo
#!/bin/sh -e
# Archive old mail.

cd ~/mail/archive

# Move read mail that is older than 1 day old out of inbox folders and into
# archive.
for folder in `find ~/Maildir ~/Maildir/.* -maxdepth 0 -type d -not -name .. -not -name .Drafts`; do
    dest=$(basename $folder | sed 's/^\.//')
    if [ "$dest" = "" ] || [ "$dest" = "Maildir" ]; then
        dest=inbox
    fi
    date=$(date +%Y-%m)
    install -d $dest
    if [ "$dest" = spam ] || [ "$dest" = virii ]; then
        # Keep for 7 days, then delete.
        archivemail -d7 --delete $folder
    elif [ "$dest" = postmaster ]; then
        # Keep for 1 day, then delete. While I'm getting flooded
        # anyhow.
        archivemail -d1 --delete $folder
    else
        archivemail -u -d2 -o $dest \
            --archive-name=$date $folder
    fi
done

for dir in `find -maxdepth 1 -mindepth 1 -type d -not -name .git`; do
    # Compress mail not compressed by archivemail.
    find $dir -maxdepth 1 -type f -regex '.*/[0-9]*-[0-9]*$' -exec gzip -9 {} \;

    # Either check old archives in, or delete them after a month.
    if [ -n "$(git log -n 1 -- "$dir")" ]; then
        git add `find $dir -maxdepth 1 -type f -regex '.*/[0-9]*-[0-9]*.gz'` 2>/dev/null || true
    else
        find $dir -maxdepth 1 -type f -mtime +31 -exec rm -f {} \;
    fi
done
if ! git commit -q -a -m "autocommit" 2>/dev/null ; then
    echo "git commit failed" >&2
    exit 1
elif ! git push 2>/dev/null ; then
    echo "git push failed" >&2
    exit 1
fi

Re: A git story

2010-03-11 Thread Joey Hess
This is a problem I'm increasingly struggling with as I have more and
more git repositories, that were converted from old subversion
repositories etc. How do I make sure I don't accidentally delete a
repository that has historical data I will want later? How do I know
which historical repository to look in to find an old dead project?

Checking them all into git is appealing, but seems to have a recursion
problem. ;)

-- 
see shy jo



Re: Git merge disappointment

2010-04-26 Thread Joey Hess
Aristotle Pagaltzis wrote:
> It’s really the diff algorithm that’s getting fouled up.

> The solution is to use a better common marker than empty lines,

Patience diff is probably a better solution. While git diff supports it,
I don't know how to make it be used for a merge.
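
For reference, later git releases document a `patience` option for the
recursive merge strategy, passed with -X. A minimal wrapper, assuming a git
new enough to accept that option, could be:

```shell
#!/bin/sh
# Sketch: merge the given branch into the current one, asking the merge
# strategy to use the patience diff algorithm. Extra arguments are
# passed straight through to git merge.
merge_with_patience() {
    git merge -X patience "$@"
}

# Example ("topic" is a hypothetical branch name):
#   merge_with_patience topic
```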

-- 
see shy jo



Re: mr update on secondary machines

2010-06-11 Thread Joey Hess
Abhishek Dasgupta wrote:
> I am using mr for updating my configuration repositories. Right now, the
> setup is quite simple, with my netbook as the primary computer, pushing
> to a git server at home. 
> 
> As I plan on using other machines with the configs as well, how can I
> configure so that git can pull via the ssh://git... address to my server
> in crontab? Is there any way ssh-agent can be accessed within crontab,
> or must these secondary machines pull via anonymous git:// URLs?
> 
> Anonymous updating is OK, but I won't be able to push from these
> machines. Does anyone have an idea about how to get around this problem?

Hmm, your subject tricked me, this is not mr-specific. Anyway..

I sometimes just use git:// and in the rare case I need to push from
such a checkout, will git push ssh:// directly and enter a password.

I sometimes set up ssh keys that are only allowed to run git-shell on
the server. This is accomplished as follows in its ~/.ssh/authorized_keys
(all on one line):
command="perl -e 'exec qw(git-shell -c), $ENV{SSH_ORIGINAL_COMMAND}'",no-agent-forwarding,no-port-forwarding,no-X11-forwarding ssh-rsa ...

I have this in .ssh/config, so that a special key will be used when
sshing to git.* domains. This way the limited use key is not used when
normally sshing to the server.

Host git.*
IdentityFile ~/.ssh/id_rsa.git

-- 
see shy jo



using git to version large files w/o checking them in

2010-09-29 Thread Joey Hess
My pain point for checking files into git is around 64 mb, and it's
partly a disk-space based pain, so stuff like bup or git large file
patches don't help much. So despite having multi-gb git repos,
I still have no way to version, e.g., videos.

It occurs to me that I'd be reasonably happy with something that let me
manage the filenames (and possibly in some cases content checksums) of
large files without actually storing their content in .git/.
So I could delete files, move them around, rename them, and add new
ones, and commit the changes (plus take some other action to transfer
file contents) to propagate those actions to other checkouts.

Does anyone know of any tools in that space?
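
To make the idea concrete, here is a rough sketch of one way it could work.
It is not a description of any existing tool, and the function name and
store layout are assumptions. The file's content moves into a directory
keyed by checksum, and a symlink stays behind that git can track cheaply.
The store path should be absolute.

```shell
#!/bin/sh
# Sketch: move FILE's content into STORE under its SHA-256 checksum and
# leave a symlink in its place. Only the small symlink would be
# committed; the store itself would live outside version control.
stash_content() {
    file=$1
    store=$2
    if command -v sha256sum >/dev/null 2>&1; then
        sum=$(sha256sum < "$file")
    else
        sum=$(shasum -a 256 < "$file")
    fi
    # Keep only the hex digest, dropping the trailing file name field.
    sum=${sum%% *}
    mkdir -p "$store"
    mv -- "$file" "$store/$sum"
    ln -s "$store/$sum" "$file"
}

# Example: stash_content "$PWD/video.dv" "$HOME/.bigfiles"
```

Renaming, moving, or deleting the symlink is then an ordinary git
operation, while transferring the stored content between machines remains a
separate step.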

-- 
see shy jo



Re: using git to version large files w/o checking them in

2010-09-30 Thread Joey Hess
hessi...@hessiess.com wrote:
> The reason for choosing subversion in this case is due to the fact that it
> *DOES NOT* store the history client side, if it did, I wouldn't even be
> able to check out due to a lack of space.

But it does still store a useless copy of each file under .svn. Unless
you've found a new svn WC implementation that avoids that wart.

> In my opinion, what is needed is a system that stores all history server
> side, does not store two copies of the data locally and has the option to
> manage files without versioning them.

I need something beyond that; I need it to be distributed as well (not
always around a large file server), and I need the ability to have
asymmetric clones that hold different subsets of the files, that change
on request.

> Unison works well in limited circumstances, but its slow and fails if you
> are syncing more than two computers. One solution is to use git with the
> remote mount/ --shared hack, but it would be better if it could do this
> natively.

Yeah, unison doesn't work for me due to everything you said -- for files
that unison can handle reasonably, I find just checking them into git with
--shared on low-space clones, while painful, is less painful than unison.

-- 
see shy jo



[announce] git-annex

2010-10-19 Thread Joey Hess
git-annex is a new thing I've been working on. It allows managing files
with git, without checking the file contents into git. Thus filling in
that missing chink in the git VCS-home armor, of what to do with files too
big to be practical for git.

http://git-annex.branchable.com/

I'm doing a preview prerelease now. I have been trusting it with my data
for several days now. :)

To give an idea of one use case, I currently have git annexes on several
SATA drives that are offline archives. At my house, I have a
file server with a git annex on it. Here at the cabin I have a portable
USB drive hooked up to a NSLU2 being an ad-hoc file server; that's
another git annex. And my netbook has a git annex on its puny SSD too.
These are *all* clones of the same git repository, and file contents
can be transferred between them in any direction. And I can sit here
behind dialup and reorganize my tree, and with a single, low bandwidth
push, sync it out to all the (online) clones.

It doesn't quite stop there -- thanks to this list, I had a great suggestion
to make it use arbitrary key-value backends to store the content. As a
proof of concept, I made a simple URL backend, that just downloads file
contents from the web. It should be able to be hooked up to something like
Tahoe-LAFS, or whatever.

-- 
see shy jo



Re: [announce] git-annex

2010-10-20 Thread Joey Hess
Yaroslav Halchenko wrote:
> lovely, finally I can get everything back
> 
> git annex get my_last_10_years_of_my_life_back.happiness
>   I was unable to access these remotes: girls
>   ...

hah! (what are their uuids? :P)

> oops
> 
> ;)  so, I followed the  walkthrough and isn't there "git commit" missing
> after 
> 
> git annex init 
> 
> invocations which leave .git-annex/uuid.log staged but not committed

There was -- but I've made it autocommit after git annex init now.

> I have just closed my eyes and proceeded forward but obviously I messed
> up -- whenever I switched to a newly created annex on a usb drive (also
> just staged uuid.log) where I have not added anything, it fails to 'git
> pull' since changes are staged for .git-annex/uuid.log which is to come
> from the 'home' now

Right, git will refuse to do a union merge if there are uncommitted
changes; once you commit them you can proceed with the pull and it will
auto-merge the logs in all cases.

The storage of the logs may eventually go to a separate branch and be
always committed automatically. There are some performance issues with
including all the logs in the main tree, and some annoyance factor,
especially if you try to switch branches in a repository. I need to
find a completely merge-free and efficient data structure for the logs
first, http://git-annex.branchable.com/bugs/branching/

Thanks for trying it out!

-- 
see shy jo



Re: [announce] git-annex

2010-10-20 Thread Joey Hess
Yaroslav Halchenko wrote:
> does it also mean, that in the case of a separate branch, I could use
> 'annexed' repository alongside with a regular GIT repository at the same
> location which would contain the development tree with large files
> moved under annex control?

You do not need a separate branch or repository to do that. 
Just check in the files you want to revision control normally, and
git annex add the big files that are inappropriate for normal git use,
in the same branch of the same repository.

The main issue with branching is as follows: Suppose you create branch B
from master, and then switch back to master for a while. Now you create
a new clone of that repository on another drive, and some of your
annexed files end up there. In the master branch of the original
repository, git-annex knows about the new drive and what files are there.
Now if you switch to branch B, it won't know about the new drive, so
won't know how to get some files. You'd have to merge master into B for
it to get an update to its worldview.

> why am I asking -- one of the problems in neuroimaging software
> development is the reliance on large datafiles necessary for various
> parts of the development or deployment process.  Inclusion of those into
> a source tree repository makes it really heavy (real story:
> recently converted 4.5GB CVS repository into a 3.5GB GIT one; it shrinks
> to 100MB if major data files get stripped down. project has been
> developing since early 90s).  So, I wondered if annexing could be a
> viable solution to inject those data files into a regular GIT repository
> allowing for optional 'getting' upon necessity, and without re-tailoring
> repository structure (i.e. where data files should go, since often they
> are spread around).

Yes, I think so. The key-value backend could be used to get those files
from a centralized place, too, rather than from other git-annex
repositories. For example, the URL backend could be used to pull them
from a web server on the intranet.

-- 
see shy jo



Re: [announce] git-annex

2010-10-20 Thread Joey Hess
Yaroslav Halchenko wrote:
> one issue I have discovered which seems to complicate such use case:
> git-annex seems to not track the content to any degree assuming
> that those data files are really immutable.  Although it is not
> common, our data files might change, and it seems there is no way in
> annex to track multiple "blobs" for the same filename?

This is up to the key-value backend used for the file. The default
backend is WORM (write once, read many). The unfinished SHA1 backend will
include the checksum in the key, so will track content -- but I'm unsure
yet how efficient it will be.

http://git-annex.branchable.com/backends/

-- 
see shy jo



Re: git-annex: dropping whole annexes

2010-10-20 Thread Joey Hess
Abhishek Dasgupta wrote:
> Is there any way to drop a whole annex at a go? One way is of course
> by going to the annex I want to drop and doing a 'git annex drop .'
> 
> However, consider the hypothetical scenario: I have just lost my
> pendrive which had an annex; since I can't go back to it I can't drop
> all the files from there. git annex would keep on thinking that my pen
> drive exists.
> 
> Would removing all instances of the pendrive uuid from
> .git-annex/*.log make git-annex forget?

Let's assume your pendrive was using the WORM or SHA1 backend. Then yes,
as long as its UUID was in the log files, git-annex might suggest you
connect it, when you tried to get a file that the logs indicate was on it.
It would be easy to filter its UUID out of the log files to avoid that.
(A line of perl -i -pe, or a half an hour of haskell type errors should
yield a command to do it. ;) 

Or, you could just edit .git-annex/uuid.log to say:

203122bc-dc94-11df-80b7-002170d25c55 my long lost and lamented pen drive :(
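
The log filtering mentioned above could also be done in plain shell. This
is a sketch only: the function name is hypothetical, and it assumes each
log line refers to at most one repository UUID, which the thread does not
spell out.

```shell
#!/bin/sh
# Sketch: drop every line mentioning UUID from the given log files.
forget_uuid() {
    uuid=$1
    shift
    for log in "$@"; do
        [ -f "$log" ] || continue
        tmp=$log.tmp.$$
        status=0
        # grep exits 1 when no lines are left, which is fine here;
        # anything above 1 is a real error and leaves the log untouched.
        grep -v -F -e "$uuid" "$log" > "$tmp" || status=$?
        if [ "$status" -le 1 ]; then
            mv -- "$tmp" "$log"
        else
            rm -f "$tmp"
            return "$status"
        fi
    done
}

# Example:
#   forget_uuid 203122bc-dc94-11df-80b7-002170d25c55 .git-annex/*.log
```

The changed logs would still need to be committed afterwards so that other
clones pick up the change.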


A related situation is if you just:

git annex add $foo  && git commit -m added
git rm $foo && git commit -m removed

Now there is a value in the backend that does not correspond to any
currently existing file in the annex. My plan has been to add some kind
of fsck command to check for those, and allow dropping them if desired.

Note that you might want to keep them -- for example, $foo might still
be present on another branch.

-- 
see shy jo



Re: [announce] git-annex

2010-10-27 Thread Joey Hess
I've now released git-annex. The changelog from the prerelease version
I posted about here before follows. Thanks for your interest!

  * Can scp annexed files from remote hosts, and check remote hosts for
    file content when dropping files.
  * New move subcommand, that makes it easy to move file contents from
    or to a remote.
  * New fromkey subcommand, for registering urls, etc.
  * git-annex init will now set up a pre-commit hook that fixes up symlinks
    before they are committed, to ensure that moving symlinks around does not
    break them.
  * More intelligent and fast staging of modified files; git add coalescing.
  * Add remote.annex-ignore git config setting to allow completely disabling
    a given remote.
  * --from/--to can be used to control the remote repository that git-annex
    uses.
  * --quiet can be used to avoid verbose output
  * New plumbing-level dropkey and addkey subcommands.
  * Lots of bug fixes.

-- 
see shy jo



Re: git-annex on Windows

2011-03-15 Thread Joey Hess
Moritz Bartl wrote:
> Hi there,
> 
> I love the idea of git-annex. Can you give me a hand to get it to work
> on Windows?

Well, I can tell you that it assumes a POSIX system, both in available
utilities and system calls. So you'd need to use cygwin or something
like that. (Perhaps you already are for git, I think git also assumes a
POSIX system.) So you need a Haskell that can target that. What this
page refers to as "GHC-Cygwin":
http://www.haskell.org/ghc/docs/6.6/html/building/platforms.html
I don't know where to get one. Did find this:
http://copilotco.com/mail-archives/haskell-cafe.2007/msg00824.html

(There are probably also still some places where it assumes / as a path
separator, although I fixed some.)

FWIW, git-annex works fine on OS X and other fine proprietary unixen. ;P

-- 
see shy jo



Re: git-annex on Windows

2011-03-15 Thread Joey Hess
Alternatively, windows versions of these functions could be found,
which are all the ones that need POSIX, I think. A fair amount of this,
the stuff to do with signals and users, could be empty stubs in windows.
The file manipulation, particularly symlinks, would probably be the main
challenge.

addSignal
blockSignals
changeWorkingDirectory
createLink
createSymbolicLink
emptySignalSet
executeFile
fileMode
fileSize
forkProcess
getAnyProcessStatus
getEffectiveUserID
getEnvDefault
getFileStatus
getProcessID
getProcessStatus
getSignalMask
getSymbolicLinkStatus
getUserEntryForID
getUserEntryForName
groupWriteMode
homeDirectory
installHandler
intersectFileModes
isRegularFile
isSymbolicLink
modificationTime
otherWriteMode
ownerWriteMode
readSymbolicLink
setEnv
setFileMode
setSignalMask
sigCHLD
sigINT
unionFileModes

-- 
see shy jo



Re: [announce] Sharebox, a FUSE filesystem relying on git-annex

2011-03-31 Thread Joey Hess
Christophe-Marie Duquesne wrote:
> I am currently writing a FUSE file system based on git-annex for
> replicating binary files on several machines. I thought I could share
> it here in order to get some ideas and contributors.

Wow, you have completely anticipated a blog post I was gonna make in a
few days that a) announces git-annex's support for using Amazon S3 as a git
"remote", and b) suggests that a free, distributed dropbox-type thing
could be built on this foundation.

My day, no, my week, is officially made. This is close enough to my
birthday that you are in the running for best birthday present. :)

> What are your goals?
> Seamless synchronization "à la dropbox".
> Ability to use with big binary files such as mp3/movies.
> Entirely decentralized.
> Don't use unnecessary space
> Keep it simple: avoid special VCS commands and keep a filesystem
> interface as much as possible.

100% agree with this list, although I think that explicitly not
mentioning what kind of large binary files a tool might be used to
store is a wise thing. ;)

> Why?
> Because sparkleshare and dvcs-autosync are bad at versioning binary files

I have not looked at sparkleshare, but have been wondering if it could
be adapted to be used as a GUI frontend for git annex.

> What do you have?
> A python implementation. It is about 600 sloc, and you'll find it on
> https://github.com/chmduquesne/sharebox
> Be careful, it is very alpha and it still does not have a proper
> conflict handler.
> 
> Hey, but copying is slow!
> On my machine, copying files to a sharebox fs is about 10 times slower
> than copying it on a normal fs. All the time is spent in python's
> os.write(): I guess the only way to work around this problem is to
> rewrite the whole thing in C, but I am keeping this for later.

I do wonder if a FUSE filesystem is really the best approach. Even a tight
C implementation will need to read/write entire file contents to put
them into the filesystem. Notice that git-annex avoids doing any copying
of large file content when adding a file (it even defaults to using a
backend that doesn't checksum, in order to preserve maximum speed).

I had been thinking more along the lines of an inotify daemon
that watches a directory (like dvcs-autosync), and drives git-annex.

One real benefit of a filesystem is that you can support
modifying the files, and proxy that through to git-annex as a delete
of the old object and an add of the new object. That certainly has value
-- do you do it?

-- 
see shy jo



Re: mr feature request: default to ~/.mrconfig when there is no .mrconfig in $PWD

2011-04-02 Thread Joey Hess
Richard Hartmann wrote:
> the second request is that running mr anywhere without a .mrconfig in
> $PWD, I would love to have mr read ~/.mrconfig by default (or possibly
> $XDG_CONFIG_HOME/mr/config (no leading dot)). That would make running
> mr from anywhere a lot easier & convenient.

This is the default behavior from mr 1.00 on. In previous versions,
it can be enabled by the -p flag.

Note that you have to configure trust settings or mr will be very picky
about the contents of .mrconfig files it finds in random directories.

-- 
see shy jo



Re: mr feature request: support variables in .mrconfig

2011-04-02 Thread Joey Hess
Richard Hartmann wrote:
> I would love to have variables supported in addition to absolute and
> relative paths in .mrconfig. At the least, ~ should work and
> $XDG_CONFIG_HOME (or other generic ones) would be even better.

It's not documented, so perhaps it's a bug, but you can already do this. Ie:

[$HOME/foo]
checkout = git clone ...

[$XDG_CONFIG_HOME/bar]
checkout = git clone ...

-- 
see shy jo



Re: [announce] Sharebox, a FUSE filesystem relying on git-annex

2011-04-02 Thread Joey Hess
Dieter Plaetinck wrote:
> @Joey: you mentioned you think inotify might be a better
> backend/paradigm for this than fuse, so do you think implementing
> git-annex in something like dvcs-autosync is feasible? and/or
> preferable?

Feasible? Certainly. Preferable? I'm in the "let a thousand flowers
bloom" phase. It's spring. :)

As Christophe-Marie has pointed out, git-annex makes annexed files
semi-immutable, and FUSE can hide that quirk, while inotify watching cannot.
That could be confusing for certain users or use cases, if they are not
aware of what is going on. Or it could be something quickly learned
about how these special replicated directories work, that files have to
be copied to be changed.

This is also an area I hope to improve in git-annex, by using git smudge
filters. So it might get a mode where files can be modified and git
commit just annexes the new content. Last time I looked at this, git was
not *quite* there to let it be done efficiently.

> I quite like dvcs-autosync (partially because inotify is more simple
> than fuse, partially because it currently works already quite well) and I'm
> interested in making it support space efficient storage of big files;
> from what I've read it should be possible to do this with git-annex
> (which should not even change how we currently deal with small files,
> they would still be in git) but I'm still doing my first baby steps
> with git-annex so I wouldn't know. Advice very welcome..

All it probably needs at its simplest is something like this
(excuse the haskell):

toobig <- checkFileSize file
if toobig
	then git_annex_add file
	else git_add file
git_commit file
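
For anyone who would rather not read the Haskell, roughly the same decision can be sketched in shell. This is only a sketch under assumptions: ANNEX_THRESHOLD, choose_add and add_and_commit are made-up names, the 1 MiB default is arbitrary, and it expects to run inside a repository where git annex init has already been done.

```shell
# Hypothetical size limit in bytes; files above it go to the annex.
ANNEX_THRESHOLD="${ANNEX_THRESHOLD:-1048576}"

# Print "annex" if the file is bigger than the limit, "plain" otherwise.
choose_add() {
    [ -f "$1" ] || return 1
    size=$(( $(wc -c < "$1") ))
    if [ "$size" -gt "$ANNEX_THRESHOLD" ]; then
        echo annex
    else
        echo plain
    fi
}

# Add one file the appropriate way, then commit whatever got staged.
add_and_commit() {
    case "$(choose_add "$1")" in
        annex) git annex add "$1" ;;
        plain) git add "$1" ;;
        *) return 1 ;;
    esac && git commit -m "autosync: $1"
}
```

Something like add_and_commit big.iso would then annex the file, while a small dotfile would go into git as usual.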

> Another note : files being tracked with git-annex through sharebox or
> dvcs-autosync or whatever should always have at least 1 "backup copy",
> so that if the file gets deleted everywhere, it still can be retrieved
> from somewhere (which raises the interesting question: where will you
> store this backup copy? introducing a node/repository which will hold
> backup copies can be considered going to a centralized model; which is
> something you (Christophe-Marie) try to explicitly avoid, but I think
> this is not necessarily a problem)

This is something git annex goes to great lengths to deal with.
It will enforce N backup copies; it tracks which other repositories
have which files; it can transfer wanted file contents from other
repositories in either a decentralized or a centralized manner; the
other repositories can be on other drives of the same computer, or
accessible by ssh, or even, now, Amazon S3.

-- 
see shy jo



Re: [announce] Sharebox, a FUSE filesystem relying on git-annex

2011-04-03 Thread Joey Hess
Dieter Plaetinck wrote:

> I think having support for this in git-annex would be very useful,
> even if it's not that efficient: if this can be dealt with in
> git-annex, individual "higherlevel" projects like sharebox and
> dvcs-autosync have less headaches.  Not to mention
> sharebox/dvcs-autosync would need to do really inefficient things to
> deal with it anyway. (because they can't involve themselves into the
> actual git/dvcs tricks, they work on a higher level of abstraction),
> and it might also benefit some users who work with git-annex manually.
> How do you see this? How hard/cumbersome is it to implement this in
> git-annex? Why is it inefficient?  It's not really clear to me after
> reading the smudge information on
> http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html

http://git-annex.branchable.com/todo/smudge/

> > if toobig
> > then git_annex_add file
> > else git_add file
> > git_commit file
> 
> unfortunately I don't think so:
> - with dvcs-autosync we often commit "early", as in, the file could still be 
> in the process of being written to, or it could be modified again after we 
> added it.
> From what I understand, we would need to forbid our users from changing the 
> file after it is added to git-annex, and worse: if git-annex does its "move 
> file, replace file with symlink" trick, while the user is writing to it, this 
> might break things.

You're right. However, you would also not want to commit many partial
versions of a large file as it was being written.

> - when a remote A pulls in the changes from remote B, for dropbox-like 
> behavior it should also automatically:
>  * run `git annex get`
>  * git commit .git-annex/*/*.log
> Does this seem about right?

Yes.
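
Those two steps, plus the pull itself, could be wrapped up roughly like this. It is a sketch only: sync_from and the RUN dry-run switch are invented for illustration, and the commit will simply fail when the location logs did not change.

```shell
# Hypothetical dry-run switch: with RUN=echo the commands are printed, not run.
RUN="${RUN:-}"

# Pull from a remote, fetch the annexed content, and commit the location logs.
sync_from() {
    $RUN git pull "$1" &&
    $RUN git annex get . &&
    $RUN git commit -m "location log update" .git-annex
}
```

Running RUN=echo first, then sync_from origin, shows what would be done without touching the repository.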

> - deletes will also need to propagate automatically (see next paragraph), 
> still need to figure out how to do that best.
> Note that dropbox-like behavior is different from the behavior you usually 
> expect from git-annex users.
> * usual git-annex behavior: every remote stands on it's own, there is no 
> forced "being in sync", so that deletes must happen as initiated by the user, 
> and this way you can prevent them from removing files if you expect it could 
> be the last instance of the file.
> * dropbox-like : remote A remove a file -> *all other remotes* should remove 
> the file, so that their "working copy" looks the same. BUT the file should 
> still be available *somewhere* so that a restore can be initiated (preferably 
> from any of these nodes)
> 
> I see two solutions here:
> - centralized: have 1 (or more) remotes that always keep a copy of the files 
> which are being removed on all other remotes, these would be backup-nodes, 
> they don't follow the strict "always in sync" rule that applies to the 
> regular nodes. (they follow the original git-annex idea more strictly)
> - decentralized: allow users to "remove files" by removing the symlink, but 
> still keep the blob in .git-annex on at least one of the nodes, so that it 
> can be restored from that.

Yes, that's the default behavior if the symlink is removed. There is
then a git annex unused pass that can be used to find and remove unused
content when space is needed. Given the size of modern drives, that
could be run nightly or something.

-- 
see shy jo



Re: [announce] Sharebox, a FUSE filesystem relying on git-annex

2011-04-03 Thread Joey Hess
Richard Hartmann wrote:
> I know Joey pondered this as well, you will find some references on
> git-annex' ikiwiki. This is needed for S3 in the medium term, anyway.
> 
> Basically, the plan is to encrypt the files with a symmetric key and
> then allow access to that key via other keys. That way, you can share
> some files between machines/people and still make sure no one gets at
> stuff they shouldn't.
> 
> The way to encrypt object files' names is still somewhat open to
> discussion, afaik.
> 
> 
> Classical dilemma: Where should this be discussed? On this list or
> within the ikiwiki? Maybe everyone interested should read through the
> ikiwiki and after some discussion here, we can dump use cases, design
> decisions etc back into ikiwiki as a TODO once Joey is happy with it?

I've put together my current thoughts at
http://git-annex.branchable.com/design/encryption/
Comments appreciated in any medium (except watercolors).

-- 
see shy jo



Re: [git-annex] reflink=auto option not present

2011-04-20 Thread Joey Hess
Abhishek Dasgupta wrote:
> Hi,
> 
> I have been using git-annex for some time, and today while trying to
> issue `git annex get` it shows:
> get big_file (copying from host...) cp: unrecognized option '--reflink=auto'
> Try `cp --help' for more information.
> 
> The command was issued in a file server which has Ubuntu lucid installed
> and the version of coreutils shipped there does not have reflink=auto
> option. I guess that is why I did not see the problem on my Debian
> systems.
> 
> Is there any workaround?

Sure, just build git-annex from source on that system. Its configure
program should detect that.

-- 
see shy jo



Re: [git-annex] compile error on lucid (was: reflink=auto option not present)

2011-04-20 Thread Joey Hess
Abhishek Dasgupta wrote:
> Abhishek Dasgupta wrote:
> > Hi,
> > 
> > I have been using git-annex for some time, and today while trying to
> > issue `git annex get` it shows:
> > get big_file (copying from host...) cp: unrecognized option '--reflink=auto'
> > Try `cp --help' for more information.
> > 
> OK, it seems that git-annex checks for the reflink=auto option while
> building and disables that functionality if not present. However, now I
> have a different problem -- while compiling on lucid (version
> 0.20110401), I get this error:
> 
> [ 6 of 72] Compiling Key  ( Key.hs, Key.o )
> 
> Key.hs:77:7:
> No instance for (Arbitrary Char)
>   arising from a use of `arbitrary' at Key.hs:77:7-15
> Possible fix: add an instance declaration for (Arbitrary Char)
> In a stmt of a 'do' expression: n <- arbitrary
> In the expression:
> do { n <- arbitrary;
>  b <- elements ['A' .. 'Z'];
>return
>  $ Key
>  {keyName = n, keyBackendName = [b], keySize = Nothing,
>   keyMtime = Nothing} }
> In the definition of `arbitrary':
> arbitrary = do { n <- arbitrary;
>  b <- elements ['A' .. 'Z'];
>return
>  $ Key
>  {keyName = n, keyBackendName = [...], keySize = Nothing,
>   keyMtime = Nothing} }
> 

This is caused by either too old, or possibly too new, a version of the
haskell quickcheck library.

One easy workaround is to edit the file in question and remove all the
"for quickcheck" stuff at the end of the file.
(The test suite will then not be able to build, but the program will.)

-- 
see shy jo



Re: managing remotes collaboratively with mr

2011-04-29 Thread Joey Hess
micah anderson wrote:
> But what if someone adds a new remote? Because I put things in the
> .mrconfig as a 'post_checkout' the new remote will not be added to the
> git repository. I could add the remotes twice, in a post_checkout (for
> the new person who wants to get them all) and then also as a pre_update,
> but that seems a bit ugly.

[DEFAULT]
post_checkout = mr addremotes
pre_update = mr addremotes

[apache]
checkout = git clone gito...@labs.riseup.net:shared-apache apache
addremotes =
	git remote add foo git://foo.ch/apache.git
	git remote add bar git://bar.net/modules/apache
	git remote add baz git://github.com/baz/apache.git

-- 
see shy jo



Re: Running mr commands on one type of repo, only?

2011-04-30 Thread Joey Hess
Richard Hartmann wrote:
> I would like to use mr to run `git gc` on all my git repos. While I
> could add a gc, or repack, option to my configs, I am wondering if
> there is a way for mr to only act on a certain type of repo.
> 
> The manpage seems to imply there no way to do this, but I figured
> asking wouldn't hurt :)

[DEFAULT]
git_gc = git gc "$@"

Now when you run "mr gc" in a svn repo, it will complain, but skip it.

joey@gnu:~/src/nslu2-utils>mr gc
mr gc: no defined action for svn repository /home/joey/src/nslu2-utils, skipping

-- 
see shy jo



Re: Running mr commands on one type of repo, only?

2011-04-30 Thread Joey Hess
Richard Hartmann wrote:
> Thanks, that works (mostly).
> 
> Two notes for the benefit of others on the list:
> 
> This does not work:
> 
>   include = cat ~/.config/mr/config.d/*
> 
>   [DEFAULT]
>   git_gc = git gc "$@"

Include outside a section is a documented special case that Works For Me.

> and mr is confused by (fake) bare repos, telling me it's an unknown repo type.

There is a git-fake-bare include file in mr that madduck developed.
Plain bare repos are however not supported.

-- 
see shy jo



Re: [mr] Setting command line parameters in a config file?

2011-04-30 Thread Joey Hess
Richard Hartmann wrote:
> is there any way to access command line parameters via the config? Setting
> 
>   quiet = 1
>   jobs = 5
> 
> in my main .mrconfig does not yield the desired effect.

This is not currently supported, but I'd probably accept a reasonably
clean patch.

-- 
see shy jo



Re: [mr] Setting command line parameters in a config file?

2011-05-01 Thread Joey Hess
Richard Hartmann wrote:
> If the attached qualifies, I can add support for verbose, quiet,
> stats, and interactive and update the docs.

Lot of code bloat. This would probably be a good opportunity to move the
global config variables into a hash, which could then be queried and
also filled in by loadconfig.

-- 
see shy jo



Re: managing remotes collaboratively with mr

2011-05-02 Thread Joey Hess
micah anderson wrote:
> > [DEFAULT]
> > post_checkout = mr addremotes
> > pre_update = mr addremotes

Of course these run on all repos, like any mr command.
So, what's really needed is:

[DEFAULT]
post_checkout = mr -d $MR_REPO addremotes
pre_update = mr -d $MR_REPO addremotes

-- 
see shy jo



Re: Converting a non-bare git-annex into a bare one

2011-06-12 Thread Joey Hess
Joey Hess wrote:
> Richard Hartmann wrote:
> > I made a pretty stupid mistake with a new repo... I created it
> > non-bare. Obviously, the first step to converting it into a bare one
> > is
> > 
> >   git clone --bare -l  
> > 
> > followed by moving the bits & pieces of git-annex into the right
> > places. As the directory structure of bare and non-bare git annex
> > repos seems to be different (.git-annex/ vs annex/ with different
> > content) and as I want to make _sure_ there won't be any problems down
> > the road, I am wondering if there is a guide or checklist to properly
> > converting between that two.
> 
> You can move .git-annex/ into annex/ in the bare repository. Also copy

Eh, I meant to say you can move .git/annex/ into annex/ in the bare
repository. The .git-annex directory is not used with a bare repository.


-- 
see shy jo



Re: Converting a non-bare git-annex into a bare one

2011-06-12 Thread Joey Hess
Richard Hartmann wrote:
> I made a pretty stupid mistake with a new repo... I created it
> non-bare. Obviously, the first step to converting it into a bare one
> is
> 
>   git clone --bare -l  
> 
> followed by moving the bits & pieces of git-annex into the right
> places. As the directory structure of bare and non-bare git annex
> repos seems to be different (.git-annex/ vs annex/ with different
> content) and as I want to make _sure_ there won't be any problems down
> the road, I am wondering if there is a guide or checklist to properly
> converting between that two.

You can move .git-annex/ into annex/ in the bare repository. Also copy
over the annex.uuid setting into config, and that should be all that's
needed.
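
Taking the corrected path from the follow-up (.git/annex/, not .git-annex/), the whole conversion might be sketched like this. annex_to_bare and its two path arguments are placeholders, and it assumes the source repository already has annex.uuid set in its .git/config.

```shell
# Convert a non-bare git-annex repository into a new bare one.
# Usage: annex_to_bare /path/to/repo /path/to/repo.git
annex_to_bare() {
    src="$1"
    dst="$2"
    git clone --bare -l "$src" "$dst" || return 1
    # Move the annexed content; the bare repository keeps it in annex/.
    if [ -d "$src/.git/annex" ]; then
        mv "$src/.git/annex" "$dst/annex" || return 1
    fi
    # Carry the annex UUID over into the bare repository's config.
    uuid=$(git --git-dir="$src/.git" config annex.uuid) || return 1
    git --git-dir="$dst" config annex.uuid "$uuid"
}
```

Since the content is moved rather than copied, the old non-bare repository no longer holds it afterwards, so it is worth checking the new bare repository before deleting anything else.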

-- 
see shy jo



Re: git annex fsck in bare repository

2011-06-13 Thread Joey Hess
Richard Hartmann wrote:
> Hi all (i.e. Joey),
> 
>   git annex fsck
> 
> is a no-op in a bare repository. While I can understand that there is
> no (easy) way to verify the symlinks, the annex objects are there
> regardless.
> Wouldn't it make sense to allow me to check repo integrity in bare
> repos, as well?

Yes, fsck could check the size and checksum (if available). It could not
check the location log correctness or number of available copies.

> As an aside, should those "smaller" issues go into the wiki or onto
> this list? Both is fine by me. Personally, I would lean towards "Keep
> small stuff on the list, save in wiki if there is a need".

I prefer to track such stuff on the wiki.

-- 
see shy jo



Re: git_status/clustergit ... mr

2011-09-25 Thread Joey Hess
Michael Nagel wrote:
> I heard this is kind of a mailing list for mr. I hope this is correct.
> 
> Anyways, for a long time I have been looking for tools to operate on a
> collection of (git) repositories and ideally return aggregate results.
> Until now I used search engines to find such tools, and only found
> googles "repo" and Mike Pearce's "show_status". I have dubbed the
> latter "clustergit" and have been using it ever since.
> 
> Today a friend told me about Joey Hess' "mr", that seems to be able to
> do a lot of the things I need, but IMHO is comparatively difficult to
> set up and is not covered by many online tutorials -- which might in
> consequence lead to the low discoverability using search engines.

The name probably doesn't help. I don't see how mr is particularly hard
to set up; all it takes to add a repository is:

git clone git://foo/bar
cd bar
mr register

> What do you think of adding an "--all" switch to mr so I
> can invoke it like this: mr --all status and it would operate on all
> directories in the current or specified directory? For simple setups
> (like mine) this is all that's ever needed and I can always switch to
> a .mrconfig later. It would make the learning curve less steep and you
> could create some nice examples in a tutorial to demonstrate (some of
> the) capabilities of mr.

I think it would be more in tune with mr to have a way to go find
checked out repositories and register them all. That way you would
avoid the overhead of needing to search through possibly many
subdirectories each time, and you would have the start of a mrconfig
file, which is where most of the power of mr lies. (As you can use
it to make mr check out the same things on another computer, or
configure special commands to run in some repositories for things
like mr update, etc.)
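
Until mr grows something like that, a one-off pass can be approximated in shell. This is a rough sketch, not an mr feature: find_git_repos and register_all are invented names, and it only looks for git checkouts (directories containing a .git directory).

```shell
# Print every directory below $1 that contains a .git directory.
find_git_repos() {
    find "$1" -type d -name .git -prune | sed 's|/\.git$||'
}

# Run "mr register" once in each repository that was found.
register_all() {
    find_git_repos "$1" | while IFS= read -r repo; do
        (cd "$repo" && mr register)
    done
}
```

After register_all ~/src, the resulting mrconfig can be edited by hand like any other.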

-- 
see shy jo



Re: git-annex setup with bare remote and optionally bup?

2011-10-27 Thread Joey Hess
Richard Hartmann wrote:
> * git annex status does not know about the global annex keys & size

I think this could be fixed fairly easily using the existing code to
list the keys in a non-checked out git branch.

> One thing I have been pondering is to create a local clone of the bare
> repo and soft-link its .git/annex/objects to the bare's one. Is this
> sane or totally crazy?

It will defeat git-annex's location tracking so could lead to data loss.

> Another thing is that I would like this central bare repo to be a
> common backup point. I.e. it should incorporate bup seamlessly. From
> how I understand the docs, this is impossible with a bare git-annex
> repo; hopefully I am wrong.

I don't know why bup couldn't be used with a bare repository, but I am
unsure if trying to use bup in the same git repository as git-annex is
worth the potential complication.

-- 
see shy jo



Re: git-annex setup with bare remote and optionally bup?

2011-10-27 Thread Joey Hess
Richard Hartmann wrote:
> On Thu, Oct 27, 2011 at 18:31, Joey Hess  wrote:
> 
> > I think this could be fixed fairly easily using the existing code to
> > list the keys in a non-checked out git branch.
> 
> Sounds good. Would that cover the other noted limitations, as well?

Unsure what you mean.

> 
> > It will defeat git-annex's location tracking so could lead to data loss.
> 
> Obviously, the non-bare repo would need to be untrusted. Assuming it's
> untrusted, is this save? Unless I can be _sure_ nothing will break, I
> am not sure if I want to try this just to see that I lost data.

It's still not safe. Consider A and B are symlinked and B is untrusted.
Now you run git annex drop in B. It checks that A has a copy of a file...
good, it does. So it deletes a file... from both.
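
The failure can be reproduced with plain directories and no git at all. This toy demo makes its own assumptions: toy_drop_in_B merely imitates the "check the other copy, then delete mine" logic and is not how git-annex is implemented.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/A/objects" "$demo/B"
ln -s "$demo/A/objects" "$demo/B/objects"   # B shares A's object store
echo "precious data" > "$demo/A/objects/key1"

# Toy stand-in for a drop in B: confirm A still has the key, then remove B's copy.
toy_drop_in_B() {
    [ -e "$demo/A/objects/$1" ] || { echo "refusing: no other copy" >&2; return 1; }
    rm "$demo/B/objects/$1"
}

toy_drop_in_B key1
ls "$demo/A/objects"    # lists nothing: A's copy went with it
```

The check passes because it looks at the very file that is about to be deleted.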

> It would avoid having to have the data twice.

Oh, I thought you meant only storing the bup data in a separate branch
in the same repository. You can have a bup special remote and store your
data there but it then has to be accessed as a special remote, not as a
regular bare remote.

-- 
see shy jo



Re: git-annex setup with bare remote and optionally bup?

2011-10-27 Thread Joey Hess
Richard Hartmann wrote:
> The problem is that, afaik, I can't have it as a bare special remote.

It would be very weird to have a bup repository that is *not* bare.

> The use case is that I built & hosted a server for backups and backups
> only. As origin, it's used to sync git state between all other repos
> and always keeps a copy of all files, forever. Obviously, I'd like to
> keep the bup repo in there, as well.

As I said, it's probably possible to use a branch of the same repository
for bup as for git-annex, but I'm not sure why it would be worth the
setup bother, compared with having a separate repository for bup.

-- 
see shy jo



Re: git-annex setup with bare remote and optionally bup?

2011-10-27 Thread Joey Hess
Richard Hartmann wrote:
> On Thu, Oct 27, 2011 at 21:21, Joey Hess  wrote:
> 
> > It would be very weird to have a bup repository that is *not* bare.
> 
> True; what I meant was the merged bup & annex, indeed.
> 
> > As I said, it's probably possible to use a branch of the same repository
> > for bup as for git-annex, but I'm not sure why it would be worth the
> > setup bother, compared with having a separate repository for bup.
> 
> It doubles the amount of disk space used.

No, there's no reason to have git-annex send any files to the origin
repository if you're storing them on the bup special remote.

> I could have a remote bare repo as origin, but never copy any files to
> it. Another special remote for bup to store data in.
> But how to fsck this beast? A third, host-local, non-bare annex repo
> to run fsck from(we are talking hundreds of GB)? Or would an annex
> fsck from a different host run fsck on the bup host? Or can't I fsck
> bup remotes at all?

git-annex does not support fscking special remotes at all, content has
to be copied from them before it can be checked. However, bup
repositories should be able to be checked with regular git fsck.

-- 
see shy jo



Re: Corrupt git-annex repository

2011-10-28 Thread Joey Hess
Richard Hartmann wrote:
> fatal: ambiguous argument 'git-annex..refs/remotes/origin/git-annex':
> unknown revision or path not in the working tree.

It seems your repository has lost the git-annex branch.

You might try running git fsck to get a better view of the damage,
but it's unlikely to fix anything.

Your best bet is to re-clone the repository from origin, and preserve
the git-annex file contents by moving .git/annex/objects from the broken
repository to the new clone. You can also move .git/config over, to
preserve the annex UUID setting (and any other configs of course). Then
run git annex fsck to make sure its location tracking for that
repository is accurate.
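
Spelled out as commands, that recovery might look roughly like the following. rescue_annex, its three arguments and the RUN dry-run switch are all made up for this sketch; it copies rather than moves .git/config so the broken repository stays inspectable, and git -C needs a reasonably recent git.

```shell
# Hypothetical dry-run switch: with RUN=echo the steps are printed, not run.
RUN="${RUN:-}"

# Usage: rescue_annex /path/to/broken <origin-url> /path/to/fresh
rescue_annex() {
    broken="$1"
    origin="$2"
    fresh="$3"
    $RUN git clone "$origin" "$fresh" &&
    $RUN mkdir -p "$fresh/.git/annex" &&
    $RUN mv "$broken/.git/annex/objects" "$fresh/.git/annex/objects" &&
    $RUN cp "$broken/.git/config" "$fresh/.git/config" &&
    $RUN git -C "$fresh" annex fsck
}
```

With RUN=echo set beforehand, the function only prints the five steps, which is a cheap way to double-check the paths first.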

-- 
see shy jo



Re: how to refactor path duplication in .mrconfig section headers?

2011-10-28 Thread Joey Hess
Adam Spiers wrote:
> If I have multiple repository paths all similar but spread across
> different .mrconfig files, e.g.
> 
> in ~/.mrconfig
> 
> [.config/mr]
> checkout = ...
> 
> in ~/.config/mr/config.d/CLI:
> 
> [$HOME/.git-repos/zsh]
> ...
> 
> [$HOME/.git-repos/mutt]
> ...
> 
> and in ~/.config/mr/config.d/GUI:
> 
> [$HOME/.git-repos/urxvt]
> ...
> 
> [$HOME/.git-repos/fonts]
> ...

mr can look much nicer if you take advantage of locality and chaining.
By locality, I mean putting a mrconfig close to the directories it
checks out, rather than in some standards-body controlled directory like
~/.config.

For example, you could have:

~/.mrconfig:

[.git-repos/CLI]
chain = true

[.git-repos/GUI]
skip = test_non_gui_machine
chain = true

~/.git-repos/CLI/.mrconfig:

[zsh]
...

[mutt]
...

~/.git-repos/GUI/.mrconfig:

[urxvt]
...

[fonts]
...
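
Note that test_non_gui_machine in the example above is not something mr ships; it stands for whatever command tells the machines apart. One possible definition, assuming "GUI machine" simply means an X display is available, would be a small shell function (which could presumably be made visible to mr through a lib setting in [DEFAULT]):

```shell
# Succeeds (so mr skips the section) when no X display is available.
test_non_gui_machine() {
    [ -z "${DISPLAY:-}" ]
}
```

The obvious caveat is that this also skips the GUI repositories when mr is run over ssh without X forwarding; a check on the hostname may suit some setups better.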

-- 
see shy jo



Re: various suggestions for mr

2011-10-28 Thread Joey Hess
Adam Spiers wrote:
> So far mr is clearly winning :-)  However, cfgctl does have one or two
> tricks up its sleeve:
> 
>   - Config modules / packages / repositories / whatever you want to
> call them are indexed by name within a unique namespace, rather
> than by directory path, and packages are grouped together into
> sections.  This allows you to easily run any of the actions on:
> 
>   - all the packages (just like running mr from $HOME)
> 
>   - a single package, just by specifying its name without needing
> to know where it lives, e.g. "cfgctl --update zsh" would
> update just the zsh repository
> 
>   - a section (i.e. group of packages) just by specifying its name
> (e.g. "CLI" or "mail" or "Xorg") without needing to know where
> anything lives, e.g. "cfgctl --pull Xorg" would update all
> repos containing config relating to my Xorg (previously X11)
> environment
> 
>   - any packages matching a regular expression e.g.
> "cfgctl --update /emacs/"

Having two namespaces for the same thing does not strike me as
necessarily a good idea. But if you wanted to do that with mr, you could
maybe take advantage of a little-known thing it does with determining the
absolute path:

joey@gnu:~>mkdir namespace
joey@gnu:~>cd namespace 
joey@gnu:~/namespace>ln -s ~/lib/sound
joey@gnu:~/namespace>ln -s ~/src/git-annex
joey@gnu:~/namespace>cd git-annex 
joey@gnu:~/namespace/git-annex>mr update
mr update: /home/joey/src/git-annex

The only problem with this approach is that it only works when inside the
symlinked directory, so mr update in ~/namespace won't update the
directories symlinked to there.
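
The path resolution involved can be seen without mr at all. A minimal
sketch, assuming a throwaway directory layout (everything under the
temporary directory is made up for illustration; how mr itself determines
the path is not shown in this thread): after changing into a symlinked
directory, the logical path still goes through the symlink, while the
physical path is the real location, matching what mr printed above.

```shell
# Build a throwaway layout mirroring the transcript (paths are made up).
base=$(mktemp -d)
mkdir -p "$base/src/git-annex" "$base/namespace"
ln -s "$base/src/git-annex" "$base/namespace/git-annex"

# Logical path: what the shell reports after cd'ing through the symlink.
logical=$(cd "$base/namespace/git-annex" && pwd -L)

# Physical path: symlinks resolved to the real directory.
physical=$(cd "$base/namespace/git-annex" && pwd -P)

echo "logical:  $logical"
echo "physical: $physical"
```

The physical path ends in src/git-annex even though the shell was pointed
at namespace/git-annex.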

> All in all, I feel that mr has a better design than cfgctl, and has
> greater longevity.  So last night I spent an hour or two doing a quick
> proof of concept, to see whether I could extend mr to implement the
> functionality I require, in particular the integration with GNU stow.
> I'm pleased to say that so far it's looking very promising :-)
> This is pretty much all that's needed:

This seems close to something I could put in mr as an includable
library. Could use some documentation though.

-- 
see shy jo



Re: Corrupt git-annex repository

2011-10-28 Thread Joey Hess
Richard Hartmann wrote:
> No, the branch was still there. If you want the contents, I can send
> them off-list.

Hmm, either the main git-annex branch or origin/git-annex seems to be
missing based on the error message, and I don't think it's the latter.

> Sounds like a good idea. One question about the UUID, though. I
> created a repo as replacement for a borked one and after running `git
> annex init foo`, it set a UUID for that repo. I copied over the old
> UUID, hoping to reduce clutter from unused repos, but git annex status
> still told me the local repo had the initial UUID.
> The above seems to imply that this should work.

The only place the uuid for a repository is stored is .git/config.

> Finally, I think there's still no way to declare a repo permanently
> gone, i.e. so that it will never show up in any status reports or
> similar ever again. Or did I just not find it?

There is not.

-- 
see shy jo



Re: Corrupt git-annex repository

2011-10-28 Thread Joey Hess
Richard Hartmann wrote:
> On Fri, Oct 28, 2011 at 21:23, Joey Hess wrote:
> 
> > Hmm, either the main git-annex branch or origin/git-annex seems to be
> > missing based on the error message, and I don't think it's the latter.
> 
> Both are there.

git log git-annex..origin/git-annex fails, so both are *not* there.

(git branch may show them, which means nothing when your repository is
corrupted)

> > The only place the uuid for a repository is stored is .git/config.
> 
> ef90acc8-019e-11e1-9354-9f042d197907 -- test

What I meant to say is that the only place that git-annex looks to
determine the UUID of the current repository is .git/config. Of course
it *stores* information about a UUID in various places -- in your
example you told it to store a description "test" for that UUID, so
status displays that even if it doesn't know of any repository that has
that UUID. Notice it did not say "test (here)"

> > There is not.
> 
> Why? If a UUID is marked as 'gone forever', the other repos could pick
> that information up over time and purge it from their own stores.

Because I don't know how to design a system that has auto-merging and
supports outright deletion of UUIDs. Consider this location log file:

1319837256.637657s 1 82e27eac-00d2-11e1-98af-a7c8649fdab4
1319835945.041525s 0 d819e6c8-01a7-11e1-af2b-9f1a8049ae44

If d819e6c8-01a7-11e1-af2b-9f1a8049ae44 is "gone forever" and is
removed, the log file will then contain:

1319837256.637657s 1 82e27eac-00d2-11e1-98af-a7c8649fdab4

But other repositories where that operation has not been performed will
still contain the first file. When one of them is union merged back
into it, the removed line will necessarily come back.
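
The effect can be modelled in a few lines of shell. This is a sketch
only: it assumes a union merge amounts to "keep every distinct line from
both sides", which is a simplification and not git-annex's actual merge
code, and the file names are made up.

```shell
# Model a union merge as "keep every distinct line from both sides".
# A simplification for illustration, not git-annex's implementation.
union_merge() {
    sort -u "$1" "$2"
}

tmp=$(mktemp -d)

# The log as other repositories still have it.
cat > "$tmp/theirs.log" <<'EOF'
1319837256.637657s 1 82e27eac-00d2-11e1-98af-a7c8649fdab4
1319835945.041525s 0 d819e6c8-01a7-11e1-af2b-9f1a8049ae44
EOF

# The local log after deleting the "gone forever" UUID.
cat > "$tmp/ours.log" <<'EOF'
1319837256.637657s 1 82e27eac-00d2-11e1-98af-a7c8649fdab4
EOF

merged=$(union_merge "$tmp/ours.log" "$tmp/theirs.log")
printf '%s\n' "$merged"
```

The merged result contains the d819e6c8 line again, because nothing in a
line-union can express "this line was deliberately removed".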

The best that could be done is to add a fourth trust state that is like
untrusted UUIDs but causes them to be hidden from view. But this would
mean additional work, to handle this edge case -- and simply changing
the description of a dead repository seems to work nearly as well.

-- 
see shy jo



Re: various suggestions for mr

2011-10-29 Thread Joey Hess
Adam Spiers wrote:
> I already did this; in fact I *had* to, in order to support GNU stow,
> which requires the stow package namespace to be the list of
> directories under a single "stow directory".  If you look for
> $STOW_PKG_PATH in the code I originally posted, you'll see:
> 
> STOW_DIR=$HOME/.cfg
> ...
> MR_NAME="`basename $MR_REPO`"
> ...
> STOW_PKG_PATH="$STOW_DIR/$MR_NAME"
> 
> and then post_{checkout,update} call the install() function which
> does:
> 
> ensure_symlink_exists "$STOW_PKG_PATH" "${MR_REPO%/}"
> 
> However, the basename operation does not preserve the uniqueness
> property which $MR_REPO had, and that's why I say that we need an
> additional namespace.

So pick an operation that does? tr / _ would do, for example.
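
A minimal sketch of the difference, assuming a hypothetical helper named
repo_to_name and two example repository paths that are not from the
thread:

```shell
# Hypothetical helper: derive a package name from a repository path.
# Leading and trailing slashes are dropped, the remaining ones become "_".
repo_to_name() {
    printf '%s\n' "$1" | sed -e 's|^/*||' -e 's|/*$||' | tr / _
}

# Two repositories that share a basename (example paths only).
for repo in /home/adam/.git-repos/CLI/emacs /home/adam/.git-repos/GUI/emacs; do
    echo "basename: $(basename "$repo")  tr: $(repo_to_name "$repo")"
done
```

Both paths have the basename "emacs", but the tr-based names differ.
Note that tr / _ is still not strictly one-to-one: paths that already
contain underscores (a_b/c versus a/b_c) would map to the same name.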

-- 
see shy jo



Re: Corrupt git-annex repository

2011-10-29 Thread Joey Hess
Richard Hartmann wrote:
> Is there any technical reason that would make
> 
> git annex init "test" --uuid=foo
> 
> impossible? That way, I could re-use the UUID when I _know_ it's OK to
> reuse them.

There is no technical reason that could not be done, but copying the
.git/config has the same effect today.

> As an aside you are using v1 UUIDs, I hugely prefer v4. Is
> there any way to change which are being generated?

It's up to the uuid program in PATH.

-- 
see shy jo



Re: Corrupt git-annex repository

2011-10-29 Thread Joey Hess
Richard Hartmann wrote:
> On Sat, Oct 29, 2011 at 18:58, Joey Hess wrote:
> 
> > There is no technical reason that could not be done, but copying the
> > .git/config has the same effect today.
> 
> OK, so git annex init, edit the UUID manually and then start to add
> data? That would still leave me with Yet One More repo in the repo
> list, defeating the initial purpose of re-using UUIDs.

copy .git/config before running git annex init.

-- 
see shy jo



Re: mr: chaining to absolute paths

2011-11-02 Thread Joey Hess
Adam Spiers wrote:
> I notice that chaining to absolute paths does not work, e.g.:
> 
> [$HOME/foo/bar]
> checkout = ...
> chain = true
> 
> This is due to the way the chaining code checks for an .mrconfig in
> the chained repository:
> 
> if ($parameter eq 'chain' &&
> length $dir && $section ne "DEFAULT" &&
> -e $dir.$section."/.mrconfig") {
> 
> Is this a feature or a bug?  I would have thought it would be useful
> to chain to absolute paths.

Probably because nobody noticed since when you're in ~/foo/bar,
~/foo/bar/.mrconfig will be read anyway without chaining. And there's
rarely a reason to use an absolute path.

Fixed in git.
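
To see why the quoted check misses absolute paths, here is a sketch in
shell rather than mr's Perl. It assumes $dir ends in a slash and that the
section name has already been expanded to an absolute path; the function
names are hypothetical, and the actual fix committed to mr is not shown in
this thread.

```shell
# What the quoted check amounts to: always prefix the section with the dir.
chain_config_naive() {
    printf '%s%s/.mrconfig\n' "$1" "$2"
}

# One possible alternative: leave absolute section names alone.
chain_config() {
    case "$2" in
        /*) printf '%s/.mrconfig\n' "$2" ;;
        *)  printf '%s%s/.mrconfig\n' "$1" "$2" ;;
    esac
}

chain_config_naive /home/adam/ /home/adam/foo/bar
chain_config       /home/adam/ /home/adam/foo/bar
chain_config       /home/adam/ foo/bar
```

The naive version builds /home/adam//home/adam/foo/bar/.mrconfig, a file
that does not exist, so the chain is silently skipped.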

-- 
see shy jo



Re: mr: chaining to absolute paths

2011-11-03 Thread Joey Hess
Adam Spiers wrote:
> On Wed, Nov 02, 2011 at 05:02:13PM -0400, Joey Hess wrote:
> > Adam Spiers wrote:
> > > I notice that chaining to absolute paths does not work, e.g.:
> > > Is this a feature or a bug?  I would have thought it would be useful
> > > to chain to absolute paths.
> > 
> > Probably because nobody noticed since when you're in ~/foo/bar,
> > ~/foo/bar/.mrconfig will be read anyway without chaining.

I probably meant to say ~/foo/.mrconfig fwiw.

> I'm beginning to suspect that the way I imagine using mr is
> fundamentally different to everyone else's way.  Your previous point
> about mr working best with locality of reference (i.e. each .mrconfig
> being in a parent or near ancestor of the directories containing the
> repos it manages) also contributed to this suspicion.  I can
> understand how that makes for clean .mrconfig files with short
> relative paths in the section headers, but I can't understand how you
> could then version control all your .mrconfig files and share them
> across computers.  And if you can't, then doesn't that discard a very
> large part of the advantage of using mr in the first place?
> 
> I guess it would really help me if one or two people would be kind
> enough to briefly describe the way they use mr, e.g.
> 
>   - How is your home directory structured, i.e. where do your mrconfig
> files and repos live within it, and which mrconfig files point to
> which repos?

Sure:

~
    .mrconfig
    doc
        .mrconfig
        (various document repositories)
    src
        .mrconfig
        (many package sources)
        d-i
            .mrconfig
    lib/backup
        .mrconfig (only exists on a few machines, various repositories)


>   - How many mrconfig files and mr-managed repos do you have?

190 repos, mostly in src
 
>   - Do you track your mrconfig files with version control?

yes

>   - Do you frequently use the -d or -c options?

never

>   - Do you usually cd to a particular directory before running mr, and
> if so, why?

I always run mr in the directory I want to affect. Sometimes this
directory contains many repositories, sometimes only one. The point of
mr is I don't need to care how many underlying repositories there are.
If I run it in ~/src/d-i, I want to act on d-i; in
~/src/d-i/package/main-menu I'm only dealing with one package; in ~/src
I want to act on all my source repos.

-- 
see shy jo



Re: mr: chaining to absolute paths

2011-11-03 Thread Joey Hess
Adam Spiers wrote:
> > >   - Do you track your mrconfig files with version control?
> > 
> > yes
> 
> How do you do that?  Are they all in one repo?  How do you get each
> one into the right subdirectory of ~ ?

They're checked out by mr as part of the repositories that provide the
subdirectories they're in.

-- 
see shy jo



Re: mr: Lazy processing of repositories

2011-11-03 Thread Joey Hess
Svend Sorensen wrote:
> I'm using mr to manage the repositories for the various software that I
> track. However, I don't want to check out all the repos by default. (The
> list is getting long). I also don't want to make a modification to the
> .mrconfig each time I want to check out a repo. (E.g. have 'skip = true'
> on all repos, and remove the skip if I decide to check out a repo.)
> 
> I'm looking for a way to do lazy checkouts and updates. A lazy repo
> would not be checked out unless I run a command to tell mr to check out
> the repo. (Something like mr checkout foo.) After I force mr to check
> out a lazy repo, mr would act on the repo for future mr runs.
> 
> One way to do this is to have skip check for the existence of a
> different file for each repo. Creating the file would activate the repo.
> Any ideas for a better approach?

Good idea! In mr git, you can now use skip = lazy to get this behavior.

The lazy shell function is built into mr, but this shows how it works:

lazy() {
	if [ "$MR_ACTION" = checkout ] || [ -d "$MR_REPO" ]; then
		return 1
	else
		return 0
	fi
}
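
The function can be exercised outside mr to see what it decides. This is
a sketch: mr normally sets MR_ACTION and MR_REPO itself, whereas here they
are set by hand, and the report helper and the temporary path are
hypothetical. A zero exit status from a skip test means "skip".

```shell
lazy() {
    if [ "$MR_ACTION" = checkout ] || [ -d "$MR_REPO" ]; then
        return 1
    else
        return 0
    fi
}

# Hypothetical helper: print what the skip test decided.
report() {
    if lazy; then
        echo "$MR_ACTION: skipped"
    else
        echo "$MR_ACTION: acted on"
    fi
}

MR_REPO="$(mktemp -d)/foo"     # repository directory that does not exist yet

MR_ACTION=update;   report     # not checked out, so update is skipped
MR_ACTION=checkout; report     # an explicit checkout is not skipped
mkdir "$MR_REPO"               # pretend the checkout happened
MR_ACTION=update;   report     # directory exists, so update now runs
```

This prints "update: skipped", "checkout: acted on" and "update: acted
on", in that order.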

-- 
see shy jo



Re: mr: Lazy processing of repositories

2011-11-03 Thread Joey Hess
Svend Sorensen wrote:
> How do I force mr to checkout a lazy repo? 'mr checkout' seems to ignore any
> arguments, so 'mr checkout repo' skips repos that have 'skip = lazy'. If I
> manually create the repo directory, mr thinks the repo is already checked out.

Yes, this is a use case for mr checkout somehow taking a parameter, or
for the stuff that Adam has been talking about to let mr be told which
repos to act on.

-- 
see shy jo



Re: mr: Lazy processing of repositories

2011-11-03 Thread Joey Hess
Joey Hess wrote:
> Svend Sorensen wrote:
> > How do I force mr to checkout a lazy repo? 'mr checkout' seems to ignore any
> > arguments, so 'mr checkout repo' skips repos that have 'skip = lazy'. If I
> > manually create the repo directory, mr thinks the repo is already checked out.
> 
> Yes, this is a use case for mr checkout somehow taking a parameter, or
> for the stuff that Adam has been talking about to let mr be told which
> repos to act on.

Actually, this already works:

mr -d foo checkout

-- 
see shy jo



Re: mr: chaining to absolute paths

2011-11-06 Thread Joey Hess
Adam Spiers wrote:
> Thanks for the info, but I'm confused because that doesn't seem to
> correspond exactly with the layout you gave earlier.  For example, you
> said that you have a ~/doc/.mrconfig, but you didn't say that there
> was a repository tracking ~/doc itself - only that ~/doc had various
> document repositories inside it.

Every directory I showed is a separate repository.

> So is your ~/doc/.mrconfig tracked by a repository?  If so it sounds
> like you have nested repositories, which git would presumably detect
> as submodules unless you are using a symlink manager?

git does not detect nested git repos as submodules, that would have to
be explicitly set up.

-- 
see shy jo


Re: Doc on creating a remote repo without cloning

2011-11-25 Thread Joey Hess
Thomas Koch wrote:
> Hi,
> 
> I'm starting to learn git-annex and tried by creating one local git-annex
> enabled repo on my laptop. Then I wanted to create another non-bare repo on
> my server to push to it.
> 
> I can not access my laptop from the server, since I'm sitting behind a NAT. 
> However there are at least two hurdles when I want to create another repo 
> without cloning:
> - I can not push to the remote repo without setting 
>   receive.denyCurrentBranch ignore

Push to a bare repository, and clone the bare repository.

> - When I do a simple git push, git tries to push the git-annex branch.
>   However I understand that the git-annex branch is local to each repo and
>   should not be shared? I get non-fast-forward complains on a push attempt.

No, the git-annex branch contains its state which must be shared between
all repositories, so it knows which repositories have which files.
You may need to run `git annex merge` before pushing this branch. Most
of the time, git-annex automatically does this merge.
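
The bare-repository layout can be sketched with plain git. Everything
here is an assumption made for illustration: both "machines" are
directories under one temporary directory, the commit is empty, and the
names laptop, server.git and server-checkout are made up. In a real
git-annex setup the git-annex branch would be pushed as well.

```shell
work=$(mktemp -d)

# Stand-in for the laptop repository, with one commit to push.
git init -q "$work/laptop"
git -C "$work/laptop" -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "initial commit"

# Bare repository on the "server". Nothing is checked out there, so no
# receive.denyCurrentBranch override is needed to push into it.
git init -q --bare "$work/server.git"
git -C "$work/laptop" remote add origin "$work/server.git"
git -C "$work/laptop" push -q origin HEAD

# The non-bare working copy on the server is a clone of the bare repository.
git clone -q "$work/server.git" "$work/server-checkout"
git -C "$work/server-checkout" log --oneline
```

Both the laptop and the server checkout then push to and pull from
server.git, so the laptop never needs to be reachable from the server.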

-- 
see shy jo



Re: new mr patches on their way (not quite ready yet though)

2011-12-01 Thread Joey Hess
Adam Spiers wrote:
> I've made a lot of progress since this last mail, and I'm conscious
> that my branch is now approximately 50 commits ahead of the official
> master branch, so I think it's time to work on convergence if
> possible.

It would be helpful if you:

* separated the stow stuff into its own branch

  When adding a feature, I need something I can diff against;
  a chain of commits is ok, or a squashed commit is ok;
  a chain of commits mixed in with other unrelated changes is not

* wrote commit messages that were longer than one line
  I need to understand a commit before I can apply it, and the choice is
  either that you type a little bit more, or that I spend significant
  time guessing and likely miss something.

  A commit such as a2515e7e89f35c8d3291da9a5908b42a8d0bb277
  is simple on its face, but is entirely lacking in an explanation
  of why the change was necessary; in what situation is there a dangling
  symlink and how did mr fail? Why is adding an error message
  of "BUG: this shouldn't happen" justified?

  Similarly, commit 655f0002ae80e21329ace97447a3a16c577949ec
  says it fixes "a small bug", but neglects to say what the bug is.

  A commit such as 49163f09b8ff2c70c64076040be772b8d37c84aa or
  1dd662640b946d681683c260f1b693cd0522b09f needs significant
  justification -- how do hardcoded VT100 color codes or a lot of
  different debug levels make mr better?

  Only a patch such as 96f2c8875bba4f7225decb60ee905815e2aeaa4a 
  doesn't need more than one line of explanation.

* avoided entangling different lines of development

  I was about to cherry-pick c4a8af985f525c2a1061576e72d526aa515151be
  until I noticed the last line ties it into the logging level stuff.

> Here's a summary of the main features of my branch:
>   - A plug-in module for integration with the GNU Stow symlink
> manager, now well-tested and stable.  I have recently become
> co-maintainer of GNU Stow, and its current git master branch and
> next official release contain enhancements which make it
> compatible with mr out-of-the-box.  This makes tracking your $HOME
> dotfiles as easy as:
> 
>   [shell-config]
>   checkout = git clone ...
>   stowable = true

Is there really no way to detect that a given repository is "stowable"
by looking at the content of the repository or some other thing?
Why not?

>   - A plug-in module which allows tar balls / zip files etc. to be
> downloaded, unpacked, and used as repositories with minimal
> effort, e.g.
> 
>   [foo]
>   checkout = mr_download_checkout http://example.com/foo.tar.bz2

A weird feature IMHO, but lib/* is sort of a contrib area and I'm
willing to accept most any type of module into it.

-- 
see shy jo



Re: new mr patches on their way (not quite ready yet though)

2011-12-01 Thread Joey Hess
Adam Spiers wrote:
> Thanks for the reply.  Sure - I can syphon those commits off into a
> stow branch.  It bothered me too that they were non-contiguous.  In
> fact, I appreciate that just dumping a whole load of uncategorised
> commits on your doorstep isn't particularly helpful, so maybe it's
> better if I create a branch for each change, and then point you at
> each branch.  Would that work for you?

There's no need for a separate branch for self-contained changes.

> Regarding color: well, I'm not sure on what level the question is
> asked, so forgive me if I'm treating a more technical question as a
> philosophical one ;-) If it's whether the use of color makes the
> program any more useful, then I guess it's a matter of individual
> taste, and I'm aware that some people don't like it.  Each to their
> own, but personally I find that rather mystifying - after all, how
> many people do you know who use a web browser which renders everything
> in black and white?  Or even within the terminal / CLI environment,
> how many people dislike the colors generated by ls(1) or git enough to
> go to the trouble of disabling them?

I have not looked at the colors in use here, but IME adding color to
terminal programs is often badly done, resulting in an angry fruit salad
effect, and often needing additional configuration to disable it, or
to tweak the colors to ones visible for colorblind users or users who
cannot read yellow text on a white background. Which I can put up with
in a mail reader, git diff, or a text editor, but given the very small
amount of output done by mr, seems likely excessive. I suppose the idea
is to pick out errors from among the rather larger amount of output
displayed by the version control system, but mr already tries to
structure its output to make it easy to do that.

> Regarding debug levels: without a more fine-grained approach to
> logging, it would have been too hard for me to understand mr's
> internals and achieve all the development I was able to.  A large part
> of the challenge for me was understanding the order in which mr parsed
> / executed things etc. and the original -v was way too verbose to be
> able to trace this - it would have required me to keep too much stuff
> in my head at once.  In fact, there were a few minor bugs which I
> spotted precisely because I entered this debugging exercise.
> 
> Having said that, I freely admit that a debugging system based on
> numeric levels isn't particularly elegant or sophisticated, and I'd be
> happy to see something better.  But it's a very standard technique
> industry-wide (c.f. syslog and hundreds of other logging software
> projects which use the concept of "priorities" or "severities"), and
> it was also The Simplest Thing That Could Possibly Work.

I have never seen such a debugging system in which I did not use exactly
two modes: none, or "turn the dial to 11 and grep for the thing I
actually wanted".

> Because it's up to the end user whether he wants to stow the
> repository or not, and that decision is entirely dependent on the
> context in which the repository is being used.

But is there no way to tell if a repository is stowed? Why couldn't the
checkout just handle registering it with stow (or whatever), and then
the other actions could check to see if it was so registered.

> Wanting to use mr to manage the download and installation of a piece
> of software is weird?  Or the implementation is weird?

Wanting to use mr to manage software not in version control is weird. :)

-- 
see shy jo



Re: new mr patches on their way (not quite ready yet though)

2011-12-01 Thread Joey Hess
Adam Spiers wrote:
> OK.  I'm not entirely sure I understand what you want though.  How
> would you define self-contained in this context?

Any patch that does not depend on any other patch.

> IMHO it's an impossible task without color.

Scanning for a new paragraph ("\n\n") is an impossible task?
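
For instance, awk's paragraph mode splits input on blank lines. A sketch,
assuming a hypothetical blocks_matching helper and made-up sample output
in the general shape of an mr run (not a real transcript):

```shell
# Print only the blank-line-separated blocks that match a pattern.
# RS="" puts awk in paragraph mode: each record is one block of output.
blocks_matching() {
    awk -v pat="$1" 'BEGIN { RS = ""; ORS = "\n\n" } $0 ~ pat { print }'
}

# Made-up output for two repositories, one of which fails.
sample='mr update: /home/user/src/foo
Already up-to-date.

mr update: /home/user/src/bar
fatal: unable to access remote
mr update: failed'

printf '%s\n' "$sample" | blocks_matching 'fatal:'
```

Only the block for src/bar is printed, with no color involved.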

-- 
see shy jo



Re: new mr patches on their way (not quite ready yet though)

2011-12-02 Thread Joey Hess
Adam Spiers wrote:
> OK, so to summarise, would it be correct to say that you are prepared
> to review
> 
>   (a) a master branch only containing self-contained commits
> 
> and/or
> 
>   (b) topic branches which contain a series of non-self-contained
>   commits representing a single line of development?

Yes.

> I have mr acting on ~70 repositories.  Each one results in several
> hundred lines of output with -v5 (which is equivalent to your upstream
> '-v'), or a minimum of ~15 lines with -v4 (which hides 'lib' code).
> So we are talking about either hundreds or thousands of lines of
> output.  Therefore to debug certain things (usually issues with one of
> my plugins or .mrconfig files) I frequently have to pipe the output to
> less(1) and then use a combination of grepping and scrolling up/down
> by the page.

If you're trying to debug something, why are you running mr on more
than the one repository necessary to reproduce the problem?

-- 
see shy jo



Re: Removing git-annex Repo

2011-12-02 Thread Joey Hess
Klaus Ethgen wrote:
> is there a other way than "git checkout git-annex; sed -i -e
> '//d' **/*(.); git commit -a; git checkout master" to remove a
> repository completely from annex knowledge?

Well, that doesn't actually work; if you do that and then pull an
independently changed git-annex branch from a remote, the
auto-union-merge will then add back all the lines you removed.

Marking the repository untrusted is the usual way.

-- 
see shy jo



Re: Removing git-annex Repo

2011-12-02 Thread Joey Hess
Joey Hess wrote:
> Klaus Ethgen wrote:
> > is there a other way than "git checkout git-annex; sed -i -e
> > '//d' **/*(.); git commit -a; git checkout master" to remove a
> > repository completely from annex knowledge?
> 
> Well, that doesn't actually work; if you do that and then pull an
> independently changed git-annex branch from a remote, the
> auto-union-merge will then add back all the lines you removed.
> 
> Marking the repository untrusted is the usual way.

With that said, the next release of git-annex will allow "git annex dead"
to be used to indicate a repo is gone. The info will still
be there in git, but it will avoid talking about this painful loss. :)

-- 
see shy jo



Re: Lost Repository Was: Removing git-annex Repo

2011-12-03 Thread Joey Hess
Klaus Ethgen wrote:
> On Fri, 2 Dec 2011 at 21:54, Klaus Ethgen wrote:
> > Not only that, also a "git annex fsck" will bring it back. But I wonder
> > where it gets the description and ID of the old remote.
> 
> Now I have the same problem but only way around.
> 
> I did create a new git and annex. Filled it with annexed content and
> pushed all to a second new created repository.
> 
> Now I cloned the first to an other machine and did git annex init Blafoo
> inside. After push I was thinking that everything is ok. But it wasn't.
> Every time I fsck or add new files or otherwise do annex stuff it trows
> out the cloned repository from trust.log and uuid.log.
> 
> I use version 3.2022 from Debian at the moment.

I don't entirely understand your description of the problem.
I'd appreciate a proper bug report with full details; transcripts, etc.

> That leads me again to the question where annex do have its store what
> it like and what not?

http://git-annex.branchable.com/internals/

-- 
see shy jo



Re: git annex --help doesn't show manual page

2011-12-04 Thread Joey Hess
Adam Spiers wrote:
> $ git annex  --help
> No manual entry for git-annex
> 
> Is this issue related to the fact that I installed git-annex with cabal 
> install?

Yes, as far as I know, cabal does not have a way to handle man pages.
"make install" does install one, which is what git displays when you run
this command (git-annex itself never actually runs).

-- 
see shy jo



Re: git annex map does not spot graph loops

2011-12-04 Thread Joey Hess
Adam Spiers wrote:
> I set up two git annex repos on the local machine which point to each
> other and then run git annex map, it chews up a load of CPU,
> presumably trying to traverse the cyclic repository graph without ever
> noticing there's a loop:

Fixed, it only happened when the repos referred to each other with
relative paths.

-- 
see shy jo



Re: New integration branch

2011-12-04 Thread Joey Hess
Adam Spiers wrote:
> OK, as discussed, here's a new 'for-joey' branch based off
> current upstream master:
> 
>   https://github.com/aspiers/kitenet-mr/commits/for-joey
> 
> All commits are self-contained, and hopefully non-contentious.  They
> are mostly minor bugfixes, but there are a few enhancements too - most
> notably lib/stow and lib/download.

37617a63ec993b128f77a945a2020ec894c58eb1
loadconfig already uses %loaded to avoid reloading the same
config twice, so this extra check is not necessary, I think.

a61c1450ff4b108af26e899a89a1d8ff49cab86c
I picked the bugfix part.

The warning message on missing chain files exposes an unclear
thing in mr; it will try to chain to directories even when their
repository has skip = true, which causes the warning to show
up unexpectedly (ie, here). I think it needs to be changed to
honor skip = true even if chain = true.

3d6acc19e4d029657f72bbf7200a48b5438a643a
cherry-picked

2e350ca416572a37df3393d50c91a02b44d9137b
cherry-picked

b3b68137988e61be1a0f7d90caf05eabf7850f44
I developed a different fix this morning that shows correct
line numbers for both the mrconfig and the position in the
include, it's in my tree.

57386ef4bb07ebe1ea56c73c0fee86a51f417cda
cherry-picked

135e0076c9a93cd0556b9b25ff355ad25546a78c
This makes "mr fetch" do a git fetch, but does nothing for
the other DVCSes which can also do things like fetch, and
there is no documentation of it.

9c87f2352214175de307efedb8fd93889a26afbc
Can you give an example of when this is needed?

417617be05404662caf3ea893bca61674eb5dbe1
Already fixed in my tree.

d8d055572ca98ec92427265106ebf240990fa217
cherry-picked

602f26714254f3c65389b7665d15d1d5d0e227a9
mr is quite typically (I know, not by you) run
inside the repository. Which would leave the user
in an apparently empty directory after mr update if
an mr update deleted and remade the whole repository.

I don't like that; I don't think things in mr should be
deleting repositories in general; mr doesn't even delete
a repo that has deleted = true, it only warns the user about it.

650620d7b6661f9cc59b4adfb6a7d945240fe5c7
f16e51cea8595afc92f3ab9230e3c5a44baed904
I've held off on these plugins since I think they
depend on 602f26714254f3c65389b7665d15d1d5d0e227a9

cf3388f443b9d7afe6dc7d8a2159b45fb01ab4e4
This is a slow way to make machine-parsable info available -- 
the similar mr list takes 8 seconds here, since it has to run
169 shells. That's ok when you're just running mr, but I would
not like to use a command that depended on that information.

If a machine-parseable list of repositories is needed,
I think it'd be better to have a perl function that emits
it in one go.

(Also, the patch references a MR_NAME that is not present in my
mr tree.)

4cd2b59d0c66d71316dfc1d411a3e3da439643bc
I'm not quite sure of the point of this refactoring,
since the factored out download function has a lot
of bootstrap-specific stuff in it?

a64e990a37ceb5ce2b200645ebc0aabe67d3626e
cherry-picked

aa3caf53a9cb35ee3d0e4173ed44e964c6b8b5ab
cherry-picked. Nice feature!



Re: Lost Repository Was: Removing git-annex Repo

2011-12-04 Thread Joey Hess
Klaus Ethgen wrote:
> At the moment I am not absolutely sure of the bug or not.
> 
> Lets make an example.
> 
>    > git annex status
>    ...
>    ---- -- Clone
>
>    > git annex drop file/name
>    drop file/name ok
>
>    > git annex status
> And no Clone repository anymore. But it is still a valid repository and
> in this case it ist one of the repositories that still has the dropped
> file!

It certainly looks like a bug. I cannot even think of how it would be
changing git-annex:uuid.log when all you did is a drop, and particularly
if there was no auto-merge of another git-annex branch.

Can you check out the git-annex branch and run git-log on uuid.log, 
and see what the most recent change to it looked like?

Do you have any clues how I can set up a repository that replicates this
behavior?

-- 
see shy jo



Re: Lost Repository Was: Removing git-annex Repo

2011-12-05 Thread Joey Hess
Klaus Ethgen wrote:
> > Can you check out the git-annex branch and run git-log on uuid.log, 
> > and see what the most recent change to it looked like?
> 
> It is an update. After that I revert this update, and the next time it
> will be purged again.

It sort of sounds as if you are checking out the git-annex branch,
manually editing and committing a file, and seeing git-annex revert
that change.

That's expected behavior, actually;
http://git-annex.branchable.com/internals/ explains why.

I don't know if this explains your whole problem.

-- 
see shy jo



Re: New integration branch

2011-12-05 Thread Joey Hess
Adam Spiers wrote:
> > 9c87f2352214175de307efedb8fd93889a26afbc
> >        Can you give an example of when this is needed?
> 
> I can't remember but I definitely saw it happen at least once :-/

My worry is that, since that really shouldn't happen AFAICS, you
were actually seeing a bug. Either that or it's a corner case I have not
identified.

> > 602f26714254f3c65389b7665d15d1d5d0e227a9
> >        mr is quite typically (I know, not by you) run
> >        inside the repository. Which would leave the user
> >        in an apparently empty directory after mr update if
> >        an mr update deleted and remade the whole repository.
> >
> >        I don't like that; I don't think things in mr should be
> >        deleting repositories in general; mr doesn't even delete
> >        a repo that has deleted = true, it only warns the user about it.
> 
> Hmm, that's a fair point, although the only alternative is to change
> the contents of the directory rather than the directory itself -
> similarly to how `git checkout' does, for instance.  I'll see if I can
> get around to doing that.  Perhaps some of the effort could be reused
> for implementing download_diff (diff against the archive file).

I think you could just use rsync :)

> > cf3388f443b9d7afe6dc7d8a2159b45fb01ab4e4
> >        This is a slow way to make machine-parsable info available --
> >        the similar mr list takes 8 seconds here, since it has to run
> >        169 shells. That's ok when you're just running mr, but I would
> >        not like to use a command that depended on that information.
> 
> Sure, that's why I used a simple on-disk cache:
> 
>   https://github.com/aspiers/kitenet-mr/commit/b60acb2e767b91ca6d406198d7eea1b3f73ad2bf
> 
> It works fine.  I could get more sophisticated and allow per-user
> configuration of the cache invalidation strategy, e.g. so that it
> would automatically rebuild the cache when ~/.mrconfig et al. are
> changed, but manual rebuilds aren't a great hardship.  In fact I could
> even rebuild the cache every time mr runs!
> 
> >        If a machine-parseable list of repositories is needed,
> >        I think it'd be better to have a perl function that emits
> >        it in one go.
> 
> I don't see how that's possible without ignoring the `skip',
> `deleted', and `include' parameters.

The include parameter is not a big problem, it's unlikely to require
more than one shell process, which will be relatively fast.

It's not clear to me what should be done about skip and deleted.
skip in particular can behave in weird ways, when something like
hours_since is used. To handle that all the skips would need to be
tested, which would be less work than "mr list" but still verging on
expensive. 

Depending on the application, it might be better to just dump all the
defined repositories including skipped and deleted ones; if the consumer
then runs mr in a skipped/deleted repo, mr will do something sane after
all.
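
A rough sketch of such a dump, written in shell rather than perl purely for
illustration. It assumes the config file uses plain [section] headers, and
it deliberately ignores skip, deleted and include; the function name
list_defined_repos is hypothetical.

```shell
# List every repository section named in an mr config file, in one pass
# and without running any shell snippets from the config. Drops the
# DEFAULT and ALIAS sections, which do not name repositories.
list_defined_repos() {
    # $1: path to the config file
    sed -n 's/^\[\(.*\)][[:space:]]*$/\1/p' "$1" |
        grep -v -x -e DEFAULT -e ALIAS
}
```

Because nothing from the config is executed, the cost is a single sed and
grep no matter how many repositories are defined.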

-- 
see shy jo



Re: alternative mechanisms for including config

2011-12-05 Thread Joey Hess
Adam Spiers wrote:
> This may be a good time to discuss the design of the `include'
> parameter.  When you were deciding what its value should be, I guess
> there were at least three possibilities:
> 
> (1) a chunk of shell-code which returns the actual shell-code to
> include
> 
> (2) a chunk of shell-code which returns a list of names of files
> to include
> 
> (3) a delimited list of files to include
> 
> You went with (1).  One advantage of this is the ability to
> dynamically generate code to include.  But this could also be achieved
> with (2), by generating the files to include and then returning the
> names of the generated files.  Also, with (1), if the shell-code has
> an issue it's harder to debug because there's no containing file (and
> line number and surrounding lines) to refer to.  The main advantage of
> (3) is that you don't have to execute any shell code at all.  This
> would facilitate implementation of your suggestion of writing a Perl
> function to emit the repo list, although there's still the problem of
> the `skip' parameter, and I suspect too many users are already relying
> on the dynamic nature of `include' for (3) to be feasible.
> 
> But might it be worth implementing (2) alongside the existing (1), via
> a new `includefiles' special parameter?

I've made mr show the included line content in error messages now.
The speed hit of running that one shell command is minor.
It doesn't seem worth bothering users with deprecating the current
include, and needless complication to have a separate way with a list of
files.

-- 
see shy jo



Re: New integration branch

2011-12-05 Thread Joey Hess
Adam Spiers wrote:
> Skipper functions like hours_since could (and probably should) decide
> not to skip if MR_ACTION is set to a read-only action such as "list" -
> arguably "diff" and "status" too, although that's a matter of personal
> taste.

It could, but skip = lazy is a harder case.
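
For the simple case, the idea could look something like the sketch below.
The set of actions treated as read-only is an assumption, and both function
names are made up; it does not address the lazy case.

```shell
# Return success when the current mr action only reads repository state.
# Which actions count as read-only is a matter of taste; this list is an
# assumption for illustration.
is_readonly_action() {
    case "$MR_ACTION" in
        list|diff|status) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: never skip for read-only actions, otherwise defer
# to whatever skip test is passed in as the arguments.
skip_unless_readonly() {
    if is_readonly_action; then
        return 1    # non-zero means "do not skip"
    fi
    "$@"
}
```

A skip line would then wrap its usual test, along the lines of
skip = skip_unless_readonly some_existing_test.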

> But maybe we should step back a bit.  Currently we know that a full
> "mr list" is not particularly fast, but has anyone actually profiled
> it to find out where most of the time is being spent?  If we're only
> guessing then we might have it completely wrong ...

It could well not be. mr list -j 10 runs in the same time as mr list -j 1,
suggesting the overhead is in something else than actually running the
shell.

-- 
see shy jo



Re: New integration branch

2011-12-05 Thread Joey Hess
Joey Hess wrote:
> It could well not be. mr list -j 10 runs in the same time as mr list -j 1,
> suggesting the overhead is in something else than actually running the
> shell.

Whoops, bad benchmark, -j comes before action.

Anyway, yes, without any calls to system(), mr list takes just 0.35 seconds.
Those calls are:

169 mr list: running >>set -e;  # actual list command
118 mr skip: running vcs test >> # 
 55 mr list: running skip test >>set -e;
 50 mr deleted: running vcs test >>

(Note that the vcs test is split between skip and deleted, but
if those features are removed, the actual list command would
trigger the same test, so those don't add overhead.)

Moving the git_test etc into perl code would be one way to speed it up
for the common case. Adding a special case optimisation to avoid the shell
for "true" and "false" brings mr list down from 8.50 to 1.81 seconds.
The remaining time here is spent running skip tests; I have a lot of them.
Most people are probably looking at sub-1-second times.
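
The shape of that special case, sketched in shell for illustration only
(the real change belongs in mr's perl code, and the name run_test is
invented here):

```shell
# Run a configured test command, but avoid spawning a shell when the
# command is literally "true" or "false".
run_test() {
    case "$1" in
        true)  return 0 ;;
        false) return 1 ;;
        *)     sh -c "$1" ;;
    esac
}
```

Only commands that are something other than the two literals pay for a
shell, which is where the saving comes from when most repositories use the
defaults.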

-- 
see shy jo



Re: New integration branch

2011-12-05 Thread Joey Hess
Joey Hess wrote:
> Moving the git_test etc into perl code would be one way to speed it up
> for the common case. Adding a special case optimisation to avoid the shell
> for "true" and "false" brings mr list down from 8.50 to 1.81 seconds.
> The remaining time here is spent running skip tests; I have a lot of them.
> Most people are probably looking at sub-1-second times.

These optimisations are now in place.

joey@gnu:~/src/d-i>time mr -q list 
1.14user 2.17system 0:05.12elapsed 64%CPU (0avgtext+0avgdata 26368maxresident)k 
0inputs+0outputs (0major+269034minor)pagefaults 0swaps
joey@gnu:~/src/d-i>time ~/src/mr/mr -q list 
0.38user 0.02system 0:00.44elapsed 91%CPU (0avgtext+0avgdata 26640maxresident)k 
0inputs+0outputs (0major+6429minor)pagefaults 0swaps

joey@gnu:~>time mr -q list 
1.67user 3.86system 0:08.75elapsed 63%CPU (0avgtext+0avgdata 26720maxresident)k 
0inputs+0outputs (0major+464487minor)pagefaults 0swaps
joey@gnu:~>time ~/src/mr/mr -q list
0.56user 0.60system 0:01.78elapsed 65%CPU (0avgtext+0avgdata 26800maxresident)k 
0inputs+0outputs (0major+84959minor)pagefaults 0swaps

-- 
see shy jo



Re: Cabal won’t install git-annex, complains about Control.Monad.IO.Control

2011-12-10 Thread Joey Hess
Sean Whitton wrote:
> Currently reviewing my setup; switching from madduck’s vcsh to RichiH’s,
> and from unison to git-annex for my ~/var/ directory of massive media
> files and backups and the like.  Liking it so far.
> 
> Have installed ghc using my distribution’s package (CRUX, it’s
> source-based) and have installed cabal, but I get this error when
> running the installation command suggested on the git-annex wiki:
> 
> ,----
> | Configuring git-annex-3.20111203...
> | Preprocessing executables for git-annex-3.20111203...
> | Building git-annex-3.20111203...
> | 
> | Annex.hs:25:8:
> |     Could not find module `Control.Monad.IO.Control':
> |       Use -v to see a list of the files searched for.
> | cabal: Error: some packages failed to install:
> | git-annex-3.20111203 failed during the building phase. The exception was:
> | ExitFailure 1
> `----
> 
> Anyone know why this might be?  This is my first experience with
> Haskell.

I have a new-monad-control branch that fixes this; you can get it from
git for now.

-- 
see shy jo



Re: Is there any interest in a patchset for mr to manage which repos are being handled?

2011-12-12 Thread Joey Hess
Adam Spiers wrote:
> Firstly, I built a library of skip functions:
> 
> https://github.com/aspiers/mr-config/blob/master/lib/skippers
> 
> which lets me write things like:
> 
> [$HOME/.GIT/adamspiers.org/gnupg.sec]
> skip = default_skipper || missing_exe gpg

I'm with you so far; this is how I use mr, so in a way it's how mr is
designed to be used.

> However, in the upstream mr, this is not fully implemented yet because
> it does not prevent checkouts of lazy repositories:
> 
> http://thread.gmane.org/gmane.comp.version-control.home-dir/396/focus=398
> 
> To solve this, I knew mr would need a mechanism for referring to a
> single repository, which in turn would require a new namespace for
> repositories.

This still seems a roundabout way to solve that problem.

Why not just:

lazy() {
	if [ "$MR_ACTION" = checkout ]; then
		if [ "$MR_FORCE" ]; then
			return 1
		else
			echo "skipping checkout of lazy repo (set MR_FORCE=1 to enable)"
			return 0
		fi
	elif [ -d "$MR_REPO" ]; then
		return 1
	else
		return 0
	fi
}

Then maybe make --force set MR_FORCE, and to enable one you just:

mr --directory somerepo --force checkout

> except that it's more direct, since if you enable 'foo', surely you
> would checkout 'foo' immediately after.  Then the only missing piece
> is 'disable'.  Personally I don't need this (yet, at least).  But if
> you really needed it, the lazy() skipper could easily be extended (or
> a new skipper written) to perform an extra check:
> 
> test -d .mrdisabled

rm -rf seems a good way to disable a lazy repo.

-- 
see shy jo



Re: git-annex: can pull, but can’t push

2011-12-14 Thread Joey Hess
Sean Whitton wrote:
> On my second machine, my laptop, I don’t seem to be able to push to the
> centralised repository: I am getting the error one gets when one hasn’t
> yet done a pull and done a merge, but I definitely have:

> | ! [rejected]        git-annex -> git-annex (non-fast-forward)

You probably need to run git annex merge. 

If you pull and then immediately push, git-annex does not get a chance to
run, so its usual transparent merging of its branch doesn't happen. So the
sequence needs to be: pull, git annex merge, push.

I'd like to get a post-pull hook into git so git annex merge can run
automatically after pulls, but it's not there yet. (Git's post-merge hook
could be used but does not always get run after a pull.)
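
Until such a hook exists, the three steps can be wrapped up by hand. A
minimal sketch follows; the remote name defaults to "origin", which is an
assumption, and the helper names run and annex_sync are invented. Which
branches a bare git push sends depends on your push.default setting.

```shell
# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Pull, let git-annex merge its own branch, then push.
annex_sync() {
    remote=${1:-origin}
    run git pull "$remote" &&
    run git annex merge &&
    run git push "$remote"
}
```

With DRY_RUN=1 set, annex_sync only prints the three commands in order,
which is a cheap way to check the sequence before trusting it.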

-- 
see shy jo



Re: syncing non-git trees with git-annex

2011-12-14 Thread Joey Hess
Richard Hartmann wrote:
> I would use
> 
>   find -name \*.avi -exec git annex add {} \;

git annex add --not --exclude '*.avi'

The method described does have a problem that the end tree will look
like B, not a merger of A and B. I've posted a comment with details.

-- 
see shy jo



Re: --repos patch rebased against 1.07

2011-12-14 Thread Joey Hess
Adam Spiers wrote:
> Bearing in mind the below:
> 
> On Tue, Dec 13, 2011 at 11:58 AM, Adam Spiers  wrote:
> > However, even ignoring all these reasons, the requirement for a
> > namespace of short repository identifiers (which cannot contain the
> > '/' character) is an unavoidable consequence of supporting integration
> > between mr and GNU Stow, since Stow already requires it.  My --repos
> > patch automatically builds this namespace by taking the basenames of
> > each repository directory, and then carefully handling the minority of
> > cases where these are non-unique with minimal disruption to the end
> > user.
> 
> please could you cherry-pick this new version of the patch?

Most people will not use it. Some people will use it when they could
have simply used symlinks to point to the actual location of their
repositories to provide short names for them. It has the potential to
affect any current user of mr, who could be confronted with a warning
message about inability to slot a repo into this other namespace. It
adds the complexity of a separate namespace, which feels strongly like
a bad idea.

You seem to be using the idea as your sole hammer to knock down various
nails that can be pulled in simpler ways, as seen in
.
Stow's need for a flat namespace also seems to have a simpler approach
of echo $MR_REPO | tr / _
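
Spelled out as a function, that approach is roughly the following sketch
(the name flat_repo_name is hypothetical):

```shell
# Turn a repository path into a flat, slash-free identifier by replacing
# every "/" with "_". Distinct paths can collide (a/b and a_b), which
# this sketch does not handle.
flat_repo_name() {
    printf '%s\n' "$1" | tr / _
}
```

For example, flat_repo_name /home/joey/src/mr gives _home_joey_src_mr.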

-- 
see shy jo



Re: git-annex: Puzzling over bare repositories

2011-12-15 Thread Joey Hess
David Edmondson wrote:
> Set the default upstream:
> 
>   laptop$ git branch master --set-upstream origin/master
>   fatal: Not a valid object name: 'origin/master'.
> laptop$ 
> 
> This fatal error seems to be the source of the later problems.

I've never needed to use --set-upstream, and you shouldn't need it here.

>   laptop$ git push -v
>   Pushing to ssh://server//one/git/m
>   Warning: No xauth data; using fake authentication data for X11 forwarding.
>   To ssh://server//one/git/m
>    ! [rejected]        git-annex -> git-annex (non-fast-forward)

You need to run git annex merge before pushing and all will be well.

-- 
see shy jo



Re: git annex fills up repository with uuid logs

2011-12-17 Thread Joey Hess
Sean Whitton wrote:
> The only merge command I typed was “git annex merge”—surely that should
> do the right thing?

git-annex merge can't do this, but git merge git-annex certainly could.

> Is a revert, rebase or reset the best way to undo the damage I’ve done
> here?

Yes.

-- 
see shy jo



Re: inventory of files unavailable in local repository?

2011-12-23 Thread Joey Hess
Adam Spiers wrote:
> $ time git annex find --not --in= >/dev/null
> git annex find --not --in= > /dev/null  6.73s user 1.76s system 21% cpu 39.483 total

> Ouch!  Joey, is there an optimization that can be made for the local
> case here?

For --in=. , the optimisation of statting the files, rather than looking
up the location from git, is already done, if that's what you mean.
Without that optimisation, it's an order of magnitude slower. 

-- 
see shy jo



Re: inventory of files unavailable in local repository?

2011-12-23 Thread Joey Hess
Adam Spiers wrote:
> Furthermore,
> 
>     git annex whereis --not --in=
> 
> lists all files, not just the ones which aren't locally available.

joey@gnu:~/lib/sound>git annex whereis --not --in=   
git-annex: no remote specified

If your mailer is eating trailing periods or something.. whereis --not --in=.
works as expected here, only displaying anything for non-present files.

-- 
see shy jo



Re: inventory of files unavailable in local repository?

2011-12-23 Thread Joey Hess
Adam Spiers wrote:
> I don't get that error message.

version?

> > whereis --not --in=.
> > works as expected here, only displaying anything for non-present files.
> 
> Ah, thanks - that works and is WAY faster.

So, your git-annex version was, apparently, seeing "--in=" as "in the
remote named ''", and doing the expensive query of git for that info..
or something. Hmm, I vaguely think I might have fixed a bug like that at
some point.

-- 
see shy jo



Re: inventory of files unavailable in local repository?

2011-12-23 Thread Joey Hess
Adam Spiers wrote:
> On Fri, Dec 23, 2011 at 5:13 PM, Joey Hess  wrote:
> > Adam Spiers wrote:
> >> I don't get that error message.
> >
> > version?
> 
> 3.20111211
> 
> > So, your git-annex version was, apparently, seeing "--in=" as "in the
> > remote named ''", and doing the expensive query of git for that info..
> > or something. Hmm, I vaguely think I might have fixed a bug like that at
> > some point.
> 
> Maybe not released yet?

Aha.. no, what's really happening is you have a remote with a
description of "", and so it thinks that is the one you mean. I have
added a special case to not match on such empty descriptions.

-- 
see shy jo



Re: git annex add silently ignores non-existent files

2011-12-24 Thread Joey Hess
Adam Spiers wrote:
> git annex add this_file_does_not_exist
> 
> does not result in a warning.  This leads to confusing (lack of)
> behaviour in certain cases, e.g.
> 
> generate_a_list_of_files_some_of_which_contain_spaces | xargs git annex add
> 
> would silently fail to add the files containing spaces (because you'd
> need the -d or -0 switch to xargs to cope with this case).

This can indeed be confusing; it's a result of git-annex using
git-ls-files --others.

joey@gnu:~/src/other/git>git ls-files --others zlib.o fofofofoofof
zlib.o

Since using git-ls-files is so convenient in most ways, all I can
think to do about this is document it.
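
In the meantime, a caller can check the arguments itself before running
the add. This is only a sketch of a wrapper, not anything git-annex
provides; the name report_missing is made up.

```shell
# Print each argument that does not exist on disk (broken symlinks count
# as existing) and return non-zero if any were missing. Intended as a
# pre-check before handing the same arguments to "git annex add".
report_missing() {
    missing=0
    for f in "$@"; do
        if [ ! -e "$f" ] && [ ! -L "$f" ]; then
            echo "no such file: $f" >&2
            missing=1
        fi
    done
    return $missing
}
```

Used as report_missing "$@" && git annex add "$@", a file name that was
split on its spaces shows up as a warning instead of being silently
dropped. Feeding xargs NUL-delimited names (xargs -0) avoids the splitting
in the first place.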

-- 
see shy jo



Re: git-annex: Puzzling over bare repositories

2011-12-26 Thread Joey Hess
Richard Hartmann wrote:
> On Thu, Dec 15, 2011 at 20:44, Richard Hartmann 
>  wrote:
> >> You need to run git annex merge before pushing and all will be well. 
> > This seems to be a _very_ common problem for new users. I know it's a
> > message from git, not git-annex, but would there be any way to display
> > a hint?
> 
> As a follow-up, there are no hooks that could be used. Pity.

To follow up, I have added a new one, called tweak-fetch. Hopefully it
will be accepted into git in due course; I already have a tweak-fetch
branch of git-annex that can use the hook to avoid any need of manually
running `git annex merge`. The hook also allows for other interesting
possibilities..

-- 
see shy jo



Re: [mr] bash-completion rules

2011-12-26 Thread Joey Hess
Adam Spiers wrote:
> there is a bigger cost - the risk of having a version of the completion
> rules which does not match the version of mr installed.

This is, in practice, not a large problem, and can be dealt with by
distribution integrators.

> There's also a converse argument.  Completion functions are not only
> coupled to the thing they are completing for, but also to the shell's
> completion API.  When the API changes, it's better to have completion
> functions within the shell's distribution, because the shell's
> developers can fix all completion functions to work with the new API
> in one go.

Which is why I would certainly not like to bundle zsh completion
functions with the programs they complete. You have to be a zsh guru to
write them, they have changed a *lot* over the years (I don't recognize
anything in the current dpkg completion that's left from the one I
originally wrote), and upstream is very responsive about keeping the
completions up to date.

I suspect that bash completion will head in a similar direction, as they
get reworked to support dynamically loading completions on demand, per
http://thread.gmane.org/gmane.comp.shells.bash.completion.devel/3375

With that said, putting a bash completion in mr now just means a little
probable pain later on, so I'm not strongly opposed. 

The real difficulty in completing mr is that it accepts an arbitrary set
of subcommands, even depending on what repository it's run in. In
practice, I just type abbreviated things like "mr up" and "mr p"
instead of reaching for the tab key; happily mr will accept any
nonambiguous abbreviations and can be taught others. :)

-- 
see shy jo



Re: git annex add silently ignores non-existent files

2011-12-27 Thread Joey Hess
Richard Hartmann wrote:
> On Sat, Dec 24, 2011 at 15:44, Joey Hess  wrote:
> 
> > Since using git-ls-files is so convenient in most ways, all I can
> > think to do about this is document it.
> 
> --error-unmatch ?

Produces a strange error message if run on files that are already in
git:

joey@gnu:~/src/other/git>git ls-files --error-unmatch --others zlib.o zlib.c
zlib.o
error: pathspec 'zlib.c' did not match any file(s) known to git.
Did you forget to 'git add'?

-- 
see shy jo



Re: git-annex: multiple web remotes?

2011-12-28 Thread Joey Hess
Olaf TNSB wrote:
> Hi joeyh and other annex-types,
> 
> I've got a couple of questions regarding the web special remote in
> git-annex.
> 
> Firstly, I was wondering if it was possible(*), or too technically
> difficult(!), to have multiple web remotes for files in an annex. For
> example, I can find a few different links for Graham's 'On Lisp' and I'd
> like to have my X local copies in annexes and also a couple of URLs for the
> book in my annex.

It does support storing multiple urls, and will try each until one
works. I just don't have much of an interface for adding/removing the urls.

If you're using a backend like SHA1, then if you simply run addurl on
both urls, and they have the same content, you'll get two annexed files
that share a key (and can then delete one of the files). git-annex does
record that both urls can be used to obtain the key. So that's one way.

Or, you can check out the git-annex branch and look in remote/web/ for the
files that contain the urls, as documented at 
http://git-annex.branchable.com/internals/ 

> Second question, & it's probably a RTFM, how do I find out what URL is
> associated with a file?  'git annex whereis file.name' only tells me 'web'
> as the annex location.

No interface currently, but see above.

-- 
see shy jo



Re: git-annex: multiple web remotes?

2011-12-29 Thread Joey Hess
Olaf TNSB wrote:
> So I can manually edit 'remote/web' to change/remove urls?

Yes, although I recommend getting 3.20111211 before manually modifying
the git-annex branch.

-- 
see shy jo



Re: Spaces in remotes and git annex

2012-01-05 Thread Joey Hess
Klaus Ethgen wrote:
> I seem to be predestined to find strange problems. ;-)
> 
> Today I cloned a remote git annexed repository that has a space in path.
> With git I have to use one backslash instead of three that are needed in
> scp. Now I tried to get a document from that repository and failed as
> git-annex seems to use scp and need all three backslashes.
> 
> Is there any way to work around that problem?

I've pushed a change that improves git-annex's support for weird urls
like this:

url = ssh://localhost/home/joey/tmp/foo bar
url = localhost:tmp/foo bar

Before, git-annex would seem to ignore the first of these (it actually
thought it was a local directory!), and complained the second was a
bad url, now both are supported and work in my tests.

Maybe this will fix your problem. If not, what kind of repository is it,
a git repository, or a special remote, and what does the url for it in
.git/config look like?

-- 
see shy jo



Re: git annex get performance issues with rsync

2012-01-18 Thread Joey Hess
Adam Spiers wrote:
> One of my USB drives just died, so I'm doing a 'git annex get --not
> --copies 1' to re-attain data redundancy.  It seems that a new rsync
> instance is invoked for each file?  In my case, I have thousands of
> photos which are big enough to be worth annexing but still not
> individually huge, so it seems that the overhead of each rsync
> invocation is significantly impacting throughput.  A quick empirical
> test showed in 20 seconds, that 'git annex get' managed to transfer 11
> photos, whereas a single (manual) rsync run transferred 33.  Is this
> easily fixable?

No, it's on the todo list but very far down it.

You can enable ssh's connection sharing though. (ControlMaster)
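
A sketch of what that setup might look like, written as a small shell
helper so it can be pointed at any config file. The host pattern, the
ControlPath naming and the ten minute ControlPersist value are all
assumptions, and ControlPersist needs an OpenSSH version that supports it.

```shell
# Append a connection-sharing stanza for one host to an ssh config file.
# The host pattern and file path are supplied by the caller; nothing here
# touches ~/.ssh/config unless asked to.
add_ssh_sharing() {
    # $1: host pattern, $2: config file to append to
    cat >> "$2" <<EOF
Host $1
    ControlMaster auto
    ControlPath ~/.ssh/control-%r@%h:%p
    ControlPersist 10m
EOF
}
```

With the stanza in place, the first connection to the host becomes the
master and later ssh or rsync runs reuse it instead of negotiating a new
connection each time.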

-- 
see shy jo



Re: git annex get performance issues with rsync

2012-01-18 Thread Joey Hess
Adam Spiers wrote:
> OK.  You mean this?
> 
> http://git-annex.branchable.com/todo/parallel_possibilities/

More like this:

http://git-annex.branchable.com/todo/wishlist:_Prevent_repeated_password_prompts_for_one_command

> > You can enable ssh's connection sharing though. (ControlMaster)
> 
> The figures above were already with ControlMaster enabled.
> It helps, but the rsync invocation per file still hurts a lot.

Are you actually measuring a significant time used in starting rsync?

I think it more likely that time is spent recording location logs to the
git-annex branch. You also mentioned you were using --copies, which
requires looking up the location log for each file, even ones that would
not otherwise be processed.

-- 
see shy jo



Re: Getting rid of lots of tiny annex’d files

2012-01-23 Thread Joey Hess
Sean Whitton wrote:
> Now that my history is full of loads of tiny symlinks and git-annex logs
> though, would it be better to start a fresh repository or will a git
> repack remove any performance issues?

There is a potential ongoing performance issue, since git-annex has kept
the log files in the git-annex branch. This makes for a larger
.git/annex/index file (which git rewrites rather inefficiently when
changes are made). The hashing used for the log files should avoid much
bloat to the tree objects, they will be somewhat bigger but not
excessively so.

-- 
see shy jo



Re: may be something like 3.0 (git-annex) source format?

2012-01-24 Thread Joey Hess
Yaroslav Halchenko wrote:
> I just thought to check with you on whether the idea itself sounds at all
> reasonable...
> 
> in the frame of NeuroDebian we package "data" packages which are usually
> quite large in size... there were numerous previous discussions on
> viable ways to gain some convenience/size tradeoff, e.g. how to not
> duplicate 1GB of data in .orig and .deb files.  But now that you
> equipped us with git-annex, it might be time to research new ways?
> 
> just a blunt idea -- what if 3.0 (git) format (with which I haven't
> played before either) gets extended to provide git-annex'ed .git with
> references to publicly (and local for buildds) available annex-ed
> remotes and original urls (added via git annex addurl).  Then by
> dropping of all load from .git  within .orig we achieve small
> size, while providing ways to fetch the original load either from urls,
> or git-annex-ed remotes.
> 
> Does it sound at all sensible?

I hope that git-annex may be useful to teams working on large data in a
VCS. I have only heard from a few projects doing this so far.

But, as a Debian source package goes, it's not sufficient for git-annex
to know a remote or a url that has the data. That wouldn't help if using
this source package on a desert island.

Now, maybe git-annex could be taught to use apt as a special remote, so
it could know that a given file in the source package can be obtained
by getting the corresponding binary package with apt and extracting the
file from it. While somewhat weird and recursive, there's no technical
reason that couldn't work. 

Whether the ftpmasters would accept such a source package, that instead
of some of the sources, contains essentially a note pointing at the
actual data in the binary package next to it; I don't know.

-- 
see shy jo



Re: Strange mr/vcsh error on Debian Squeeze

2012-01-24 Thread Joey Hess
Sean Whitton wrote:
> Hi,
> 
> Checking out to a new Debian install, receiving this error when I try to
> update:
> 
> sh: perl:: not found
> sh: -d: not found
> mr update: unknown repository type and no defined update command for
> blahblah
> 
> Any ideas?

No, but mr -v might help.

-- 
see shy jo



Re: Strange mr/vcsh error on Debian Squeeze

2012-01-24 Thread Joey Hess
Sean Whitton wrote:
> Usage: mr [options] action [params …]
> (Use mr help for man page.)
> 
> Is my version too old?

I mean adding -v to the failing command.

-- 
see shy jo



Re: Strange mr/vcsh error on Debian Squeeze

2012-01-25 Thread Joey Hess
Sean Whitton wrote:
> Thank you for your reply, and sorry for not thinking a little more before
> e-mailing.  Here is the output.  I think that this may be a vcsh problem
> rather than a mr problem after looking at this.

Looks to me like a mr version older than 1.09 being used with a mr
include file for vcsh that's written for that version.

-- 
see shy jo


