Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-03 Thread Andreas Schwab
David Kastrup  writes:

> Are there some measures one can take/configure in the parent repository
> such that (named or all) additional directories inside of $GITDIR/refs
> would get cloned along with the rest?

$ git config --add remote.origin.fetch '+refs/notes/*:refs/notes/*'

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread Jed Brown
John Keeping  writes:
> I actually wonder if you could do this with notes and git-grep; for
> example:
>
>   git grep -l keeping.me.uk refs/notes/amlog |
>     sed -e 's/.*://' -e 's!/!!g'
>
> That should be relatively efficient since you're only looking at the
> current notes tree.

I added notes handling to gitifyhg and would search it similarly to this.
Since gitifyhg is two-way, I could not modify the commits.  Later, when
we converted several repositories (up to 50k commits/80 MB), I appended

  Hg-commit: $Hg_commit_hash

to all the commit messages.  This way it shows up on the web interface,
users don't have to obtain the notes specially, and "git log --grep"
works naturally.  I think it's worth considering this simple solution;
existing Git users won't mind recloning once.
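
A minimal sketch of how such a lookup could work, assuming a throwaway
repository; the Mercurial hash, the commit subjects and the identity
below are made-up placeholders, not values from any real conversion:

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
# Identity used only inside this throwaway repository
git config user.name "Example"
git config user.email "example@example.invalid"

# A converted commit carrying the original Mercurial hash as a trailer
# (placeholder hash, chosen for illustration)
hg_hash=0123456789abcdef0123456789abcdef01234567
git commit -q --allow-empty -m "Fix frobnication" -m "Hg-commit: $hg_hash"
git commit -q --allow-empty -m "Unrelated change"

# Map the Mercurial hash back to the Git commit that records it
found=$(git log --format=%H --grep="^Hg-commit: $hg_hash")
echo "$found"
```

Since the trailer is part of the commit message, nothing beyond an
ordinary clone is needed for the lookup to work.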




Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread Jeff King
On Sun, Feb 02, 2014 at 11:37:39AM +0100, David Kastrup wrote:

> So I mused: refs/heads contains branches, refs/tags contains tags.  The
> respective information would likely easily enough be stored in refs/bzr
> and refs/bugs and in that manner would not pollute the "ordinary" tag
> and branch spaces, rendering "git tag" and/or "git branch" output mostly
> unusable.  I tested creating such a directory and entries and indeed
> references like bzr/39005 then worked.

Yes. The names "refs/tags" and "refs/heads" are special by convention,
and there is no reason you cannot have other hierarchies (and indeed, we
already have "refs/notes" and "refs/remotes" as common hierarchies).

> However, cloning from the repository did not copy those directories and
> references, so without modification, this scheme would not work for
> cloned repositories.

Correct. Anyone who wants them will have to ask for them manually, like:

  git config --add remote.origin.fetch '+refs/bzr/*:refs/bzr/*'

after which any "git fetch" will retrieve them.
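
The behaviour described above can be reproduced locally. This is only a
sketch under assumptions: the repository names "parent" and "child" and
the ref name refs/bzr/r1 are invented for the demonstration.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# "Parent" repository with one commit and a ref outside heads/tags
git init -q parent
git -C parent config user.name "Example"
git -C parent config user.email "example@example.invalid"
git -C parent commit -q --allow-empty -m "Initial commit"
git -C parent update-ref refs/bzr/r1 HEAD

# A plain clone does not pick up the refs/bzr hierarchy
git clone -q parent child
before=$(git -C child for-each-ref refs/bzr | wc -l | tr -d ' ')

# Ask for the extra hierarchy explicitly, then fetch
git -C child config --add remote.origin.fetch '+refs/bzr/*:refs/bzr/*'
git -C child fetch -q origin
after=$(git -C child for-each-ref refs/bzr | wc -l | tr -d ' ')

echo "refs/bzr entries before: $before, after: $after"
```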

> Are there some measures one can take/configure in the parent repository
> such that (named or all) additional directories inside of $GITDIR/refs
> would get cloned along with the rest?

No. It is up to the client to decide which parts of the ref namespace
they want to fetch. The server only advertises what it has, and the
client selects from that.


Others mentioned that refs were never really intended to scale to
one-per-commit. We serve some repositories with tens of thousands of
refs from GitHub, and it does work. On the backend, we even have some
repos in the hundreds of thousands (but these are not client facing).
Most of the pain points (like O(n^2) loops) have been ironed out, but
the two big ones are still:

  - server ref advertisement lists _all_ refs at the start of the
    conversation. So, e.g.,

      git fetch git://github.com/Homebrew/homebrew.git

    sends 2MB of advertisement just so a client can find out "nope,
    nothing to fetch".

  - the packed-refs storage is rather monolithic. Reading a value from
    it currently requires parsing the whole file. Likewise, deleting a
    ref requires rewriting the whole file.
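
To get a feel for both costs without touching the network, here is a
rough sketch using a local throwaway repository. The count of 1000 and
the refs/bzr/rN names are arbitrary choices, and the byte count of
"git ls-remote" output is only an approximation of what goes over the
wire, not an exact measurement of the advertisement:

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "Initial commit"
commit=$(git rev-parse HEAD)

# Create 1000 refs in one go; "update-ref --stdin" avoids 1000 processes
i=1
while [ "$i" -le 1000 ]; do
    echo "create refs/bzr/r$i $commit"
    i=$((i + 1))
done | git update-ref --stdin

# Every one of these refs is listed to a client that connects
adv_lines=$(git ls-remote . | wc -l | tr -d ' ')
adv_bytes=$(git ls-remote . | wc -c | tr -d ' ')
echo "advertised refs: $adv_lines ($adv_bytes bytes)"

# Pack the loose refs; they all end up in the single packed-refs file
git pack-refs --all
packed=$(grep -c ' refs/bzr/' .git/packed-refs)
echo "refs/bzr entries in packed-refs: $packed"
```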

So what you are proposing will work, but do note that there is a cost.

-Peff


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread John Keeping
On Sun, Feb 02, 2014 at 12:42:52PM +0100, David Kastrup wrote:
> John Keeping  writes:
> 
> > On Sun, Feb 02, 2014 at 12:19:43PM +0100, David Kastrup wrote:
> >> Duy Nguyen  writes:
> >> 
> >> > The file is for past commits only.
> >> 
> >> > New commits can contain this info in their messages.
> >> 
> >> If it's not forgotten.  Experience shows that things like issue numbers
> >> have a tendency to be omitted, and then they stay missing.
> >> 
> >> At any rate, this is exactly the kind of stuff that tags are useful for,
> >> except that using them for all that would render the "tag space"
> >> overcrowded.
> >
> > Actually, I would say this is exactly the sort of thing notes are for.
> >
> > git.git uses them to map commits back to mailing list discussions:
> 
> But that's the wrong direction.  What is needed in the Emacs case is
> mapping the Bazaar reference numbers (and bug numbers) to commits.

Ah, OK.  I hadn't quite read carefully enough.

I actually wonder if you could do this with notes and git-grep; for
example:

  git grep -l keeping.me.uk refs/notes/amlog |
    sed -e 's/.*://' -e 's!/!!g'

That should be relatively efficient since you're only looking at the
current notes tree.
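
Applied to the Bazaar case, the same idea might look like the sketch
below. The notes ref name "bzr" and the "bzr-revno:" line format are
assumptions made for the example; nothing in Git prescribes them.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "First"
git commit -q --allow-empty -m "Second"

# Attach a note naming a (made-up) Bazaar revision to each commit
git notes --ref=bzr add -m "bzr-revno: 1" HEAD~1
git notes --ref=bzr add -m "bzr-revno: 2" HEAD

# Each path in the notes tree is the hex name of the annotated commit,
# possibly split by "/" once the tree grows and Git fans it out.
# The anchored pattern keeps "2" from also matching "20", "21", ...
found=$(git grep -l '^bzr-revno: 2$' refs/notes/bzr |
        sed -e 's/.*://' -e 's!/!!g')
echo "$found"
```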

> While it is true that the history rewriting approach would not deliver
> this either (short of git log --grep with suitable patterns), I was
> looking for something less of a crutch here.
> 
> > Notes aren't fetched by default, but it's not hard for those interested
> > to add a remote.*.fetch line to their config.
> 
> If we are talking about measures everybody has to actively take before
> getting access to functionality, this does not cross the convenience
> threshold making it a solution preferred over others.  But it's probably
> feasible to configure a fetch line doing this that will get cloned when
> first cloning a repository.

I'm assuming you'll need some form of tool (at least a script) to
manipulate this feature; it wouldn't be too hard for that to set this up
the first time it's run.
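
Such a first-run setup could be as small as the following sketch.
ensure_refspec is a hypothetical helper name (not a Git command), and
the remote URL is a dummy that is never contacted:

```shell
#!/bin/sh
set -e

# Add a fetch refspec to a remote unless it is already configured.
# ensure_refspec is a hypothetical helper, not part of Git.
ensure_refspec() {
    remote=$1
    refspec=$2
    if ! git config --get-all "remote.$remote.fetch" |
            grep -qxF -e "$refspec"; then
        git config --add "remote.$remote.fetch" "$refspec"
    fi
}

# Demonstration in a throwaway repository with a dummy remote URL
repo=$(mktemp -d)
cd "$repo"
git init -q .
git remote add origin https://example.invalid/repo.git

ensure_refspec origin '+refs/bzr/*:refs/bzr/*'
ensure_refspec origin '+refs/bzr/*:refs/bzr/*'   # second call is a no-op

count=$(git config --get-all remote.origin.fetch |
        grep -cxF -e '+refs/bzr/*:refs/bzr/*')
echo "$count"
```

Checking before adding keeps the configuration from accumulating
duplicate lines when the tool is run repeatedly.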


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread David Kastrup
Duy Nguyen  writes:

> On Sun, Feb 2, 2014 at 6:19 PM, David Kastrup  wrote:
>> Since Git has a working facility for references that is catered to do
>> exactly this kind of mapping and already _does_, it seems like a
>> convenient path to explore.
>
> It will not scale. If you make those refs available for
> cloning/fetching, all of them will be advertised first thing when git
> starts negotiating. Imagine thousands of refs (and the number keeps
> increasing) sent to the receiver at the beginning of every connection.

In current LilyPond repository:
  git tag|wc
      969     969   15161

In current Emacs mirror:
  git tag|wc
     1202    1202   15729

In current Git repository:
  git tag|wc
      498     498    4820

> Something like "reverse git-notes" may transfer more efficiently. Or
> we need to improve the git protocol to handle massive refs better,
> something that's been discussed for a while without any outcome.

I think that even disregarding special use of references, _existing_
practice would already appear to warrant being able to deal with
thousands of refs in a reasonable manner.

It's a reasonable expectation to have a tag per (potentially
intermediate) release or release candidate.  For any project publishing
reproducible daily snapshots, the threshold of 1000 will get reached
within a few years.

Of course, it is relevant information to know that right _now_
references will not scale.  But that does not seem like a defensible
long-term perspective.

-- 
David Kastrup


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread Duy Nguyen
On Sun, Feb 2, 2014 at 6:19 PM, David Kastrup  wrote:
> Since Git has a working facility for references that is catered to do
> exactly this kind of mapping and already _does_, it seems like a
> convenient path to explore.

It will not scale. If you make those refs available for
cloning/fetching, all of them will be advertised first thing when git
starts negotiating. Imagine thousands of refs (and the number keeps
increasing) sent to the receiver at the beginning of every connection.
Something like "reverse git-notes" may transfer more efficiently. Or we
need to improve the git protocol to handle massive refs better,
something that's been discussed for a while without any outcome.
-- 
Duy


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread David Kastrup
John Keeping  writes:

> On Sun, Feb 02, 2014 at 12:19:43PM +0100, David Kastrup wrote:
>> Duy Nguyen  writes:
>> 
>> > The file is for past commits only.
>> 
>> > New commits can contain this info in their messages.
>> 
>> If it's not forgotten.  Experience shows that things like issue numbers
>> have a tendency to be omitted, and then they stay missing.
>> 
>> At any rate, this is exactly the kind of stuff that tags are useful for,
>> except that using them for all that would render the "tag space"
>> overcrowded.
>
> Actually, I would say this is exactly the sort of thing notes are for.
>
> git.git uses them to map commits back to mailing list discussions:

But that's the wrong direction.  What is needed in the Emacs case is
mapping the Bazaar reference numbers (and bug numbers) to commits.

While it is true that the history rewriting approach would not deliver
this either (short of git log --grep with suitable patterns), I was
looking for something less of a crutch here.

> Notes aren't fetched by default, but it's not hard for those interested
> to add a remote.*.fetch line to their config.

If we are talking about measures everybody has to actively take before
getting access to functionality, this does not cross the convenience
threshold making it a solution preferred over others.  But it's probably
feasible to configure a fetch line doing this that will get cloned when
first cloning a repository.  That's not too hot for people with existing
repositories, but since we are talking about a migration from Bazaar
anyway, Git users currently are so by choice and so might be more
willing to update their configuration if it helps with avoiding a fully
new clone.

-- 
David Kastrup


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread John Keeping
On Sun, Feb 02, 2014 at 12:19:43PM +0100, David Kastrup wrote:
> Duy Nguyen  writes:
> 
> > The file is for past commits only.
> 
> > New commits can contain this info in their messages.
> 
> If it's not forgotten.  Experience shows that things like issue numbers
> have a tendency to be omitted, and then they stay missing.
> 
> At any rate, this is exactly the kind of stuff that tags are useful for,
> except that using them for all that would render the "tag space"
> overcrowded.

Actually, I would say this is exactly the sort of thing notes are for.

git.git uses them to map commits back to mailing list discussions:

  git fetch git://github.com/gitster/git +refs/notes/amlog:refs/notes/amlog &&
  git log --notes=amlog

See also notes.displayRef in git-config(1).

Notes aren't fetched by default, but it's not hard for those interested to
add a remote.*.fetch line to their config.


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread David Kastrup
Duy Nguyen  writes:

> On Sun, Feb 2, 2014 at 5:37 PM, David Kastrup  wrote:
>> in the context of an ongoing discussion on the Emacs developer list of
>> converting the Bzr repository of Emacs, one question (with different
>> approaches) is where to put the information regarding preexisting Bazaar
>> revision numbers and bug tracker ids: those are not present in the
>> current Git mirror.
>>
>> Putting them in the commit messages would require a full history
>> rewrite, and if some are missed in the process, this cannot be fixed
>> afterwards.
>
> What do you need them for?

Resolving references typically found in commit messages.  Also
establishing correlation to bug issue numbers.

> Perhaps putting everything in a file, maybe sorted by SHA-1, would
> suffice? It should not be too hard to write a script to map bug
> tracker id to a commit id.

We are not talking about "it should not be too hard".  We are talking
about "obvious and reliable enough to render a complete history rewrite
pointless".

> The file is for past commits only.

> New commits can contain this info in their messages.

If it's not forgotten.  Experience shows that things like issue numbers
have a tendency to be omitted, and then they stay missing.

At any rate, this is exactly the kind of stuff that tags are useful for,
except that using them for all that would render the "tag space"
overcrowded.

Rest assured that the "standard" answers have been beaten to death in the
Emacs developer list thread several times over.

So I'm more interested in getting actual answers dealing with the
question I have asked rather than suggestions for questions that would
be easier to answer.

Since Git has a working facility for references that is catered to do
exactly this kind of mapping and already _does_, it seems like a
convenient path to explore.

It apparently even already works with --decorate:

commit c92b1fb3ad8514f08fc4cec531211717955a5c29 (tag: release/2.19.1-1, origin/release/unstable, tag: refs/bzr/r15000)
Author: Phil Holmes 
Date:   Sun Jan 19 15:01:48 2014 +

    Release: update news.

-- 
David Kastrup


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread Duy Nguyen
On Sun, Feb 2, 2014 at 5:37 PM, David Kastrup  wrote:
> in the context of an ongoing discussion on the Emacs developer list of
> converting the Bzr repository of Emacs, one question (with different
> approaches) is where to put the information regarding preexisting Bazaar
> revision numbers and bug tracker ids: those are not present in the
> current Git mirror.
>
> Putting them in the commit messages would require a full history
> rewrite, and if some are missed in the process, this cannot be fixed
> afterwards.

What do you need them for? Perhaps putting everything in a file, maybe
sorted by SHA-1, would suffice? It should not be too hard to write a
script to map bug tracker id to a commit id. The file is for past
commits only. New commits can contain this info in their messages.
-- 
Duy


Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread David Kastrup

Hi,

in the context of an ongoing discussion on the Emacs developer list of
converting the Bzr repository of Emacs, one question (with different
approaches) is where to put the information regarding preexisting Bazaar
revision numbers and bug tracker ids: those are not present in the
current Git mirror.

Putting them in the commit messages would require a full history
rewrite, and if some are missed in the process, this cannot be fixed
afterwards.

So I mused: refs/heads contains branches, refs/tags contains tags.  The
respective information would likely easily enough be stored in refs/bzr
and refs/bugs and in that manner would not pollute the "ordinary" tag
and branch spaces, rendering "git tag" and/or "git branch" output mostly
unusable.  I tested creating such a directory and entries and indeed
references like bzr/39005 then worked.
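
For reference, the kind of test described here can be sketched roughly
as follows. It uses "git update-ref" rather than creating files by
hand, which is an assumption about method and not necessarily what was
done in the original test; the repository is a throwaway one:

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "Initial commit"

# Record a Bazaar revision number as a ref of its own
git update-ref refs/bzr/39005 HEAD

# Short names are tried as refs/<name> among other candidates,
# so "bzr/39005" resolves without spelling out the full ref
resolved=$(git rev-parse --verify bzr/39005)
echo "$resolved"

# The new hierarchy stays out of the tag listing
tags=$(git tag | wc -l | tr -d ' ')
echo "tags: $tags"
```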

However, cloning from the repository did not copy those directories and
references, so without modification, this scheme would not work for
cloned repositories.

Are there some measures one can take/configure in the parent repository
such that (named or all) additional directories inside of $GITDIR/refs
would get cloned along with the rest?

It would definitely open viable options for dealing with mirrors and/or
repository migrations in general.

-- 
David Kastrup
