Re: Adding more namespace support to git

2016-08-24 Thread Jeff King
On Sat, Aug 20, 2016 at 08:07:00PM +0100, Richard wrote:

> Because git is not namespace aware for anything but git-upload-pack
> and git-receive-pack, I've had to implement namespace parsing in cgit
> for listing branches, showing logs, displaying notes and commit
> decorations.  It might be more useful if this support was added to git
> itself, so other git servers could make use of it so there's less
> duplicated code.
> 
> I think the way to do this would be to make the low-level ref reading
> functions, read_raw_ref, for_each_reflog_ent*, reflog_exists etc.,
> interpret the ref they are passed as being relative to the current git
> namespace.

At GitHub, we store many forks for a single repository, and we
considered using namespaces for our storage strategy. But like you, we
ran into the problem that they only work for certain operations. :)

Our solution is to use separate repositories, each with their own ref
storage, but pointing to a shared object store. It works, but there are
a lot of gotchas and performance issues with migrating objects around
and running repacks.

Michael Haggerty (cc'd) has picked up the pluggable ref-backend work
started by others, and I know he has some ideas on doing namespaces at that
level. Basically, the concept of "namespaces" should be able to plug in
between the actual storage backend and the rest of the git code. Git
code wouldn't have to care whether the namespace plugin was in use, and
the namespace plugin wouldn't have to care which storage backend was in
use (it would just silently translate "refs/heads/foo" into
"refs/namespaces/123/heads/foo", and vice versa).

That's a very "complete" solution in the sense that the git code does
not know about the namespaces, and cannot even access refs outside of
it. But I think in general it would do what you want. Most operations
would run in a certain namespace (i.e., pretend nothing outside of that
namespace exists, for fetches, diffs, etc), and others would want to
look at the whole namespace (e.g., repacking, pruning). I don't know of
any operations that want to see both views in the same process.
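
A minimal sketch of such a translation layer, in Python rather than
git's C, purely for illustration. DictRefStore and NamespacedRefStore
are hypothetical names, not part of git or of the ref-backend work, and
only a single-level namespace is handled. The stored layout follows
gitnamespaces(7), which keeps the full ref name under
refs/namespaces/<name>/, so the stored name carries an inner refs/
component.

```python
class DictRefStore:
    """Stand-in storage backend: maps full ref names to object ids."""

    def __init__(self):
        self._refs = {}

    def read(self, name):
        return self._refs.get(name)

    def write(self, name, oid):
        self._refs[name] = oid

    def names(self):
        return sorted(self._refs)


class NamespacedRefStore:
    """Hypothetical layer that hides everything outside one namespace."""

    def __init__(self, backend, namespace):
        self._backend = backend
        # Assumption: single-level namespace, laid out as in gitnamespaces(7).
        self._prefix = f"refs/namespaces/{namespace}/"

    def read(self, name):
        return self._backend.read(self._prefix + name)

    def write(self, name, oid):
        self._backend.write(self._prefix + name, oid)

    def names(self):
        # Strip the prefix again so callers only ever see plain ref names.
        return [
            stored[len(self._prefix):]
            for stored in self._backend.names()
            if stored.startswith(self._prefix)
        ]


backend = DictRefStore()                  # whole view: repack, prune
fork = NamespacedRefStore(backend, "123")  # namespaced view: fetch, diff
fork.write("refs/heads/foo", "abc123")
```

Code holding `fork` sees only refs/heads/foo, while code holding
`backend` sees the stored name and every other namespace as well.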

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Adding more namespace support to git

2016-08-22 Thread Richard
On 22 August 2016 at 20:16, Josh Triplett  wrote:
> On Mon, Aug 22, 2016 at 07:36:31PM +0100, Richard wrote:
>> On 21 Aug 2016 15:07, "Josh Triplett"  wrote:
>> > I'd like to see it work more automatically than that.  Perhaps a
>> > separate environment variable to set the client-side namespace?
>>
>> How about a config option? That could be set globally, per repository, in
>> the environment or on the command line.
>
> That might work, though you wouldn't normally want to set it globally or
> per-repository (since it affects access to a repository and you'd
> typically want to use multiple different values or it wouldn't have much
> point).

Globally is a bit contrived, but it could be used to keep the top-level
namespace clean: you might opt to default to fetching into a namespace
called "main", so that temporarily fetching into a different namespace
wouldn't be problematic.

Perhaps it's a kernel tree from a vendor with a messy branch naming
scheme. You don't want to fetch it into your primary namespace and make
it difficult to find your own branches, but you don't know which of
their branches you need until you've got them all. So you fetch into a
different namespace rather than a fresh clone, to avoid re-fetching
everything (numerous alternative solutions exist). Then, once you've
found out which branch you need, you make a note, switch back to the
"main" namespace and re-fetch just that branch.

A per-repository default namespace could also be useful: if an upstream
repository has multiple namespaces (code vs documentation, maybe), you
could fetch them all and then switch between them when you need to work
on different parts. And if it's config rather than an environment
variable, it will persist between shell sessions more easily.
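
One way the lookup order for such a setting might work, sketched in
Python for illustration only. Nothing here exists in git: the
environment variable GIT_LOCAL_NAMESPACE and the config key
core.localnamespace are made-up names, and the precedence (command
line, then environment, then repository config, then global config) is
an assumption.

```python
def resolve_client_namespace(cli_value, environ, repo_config, global_config,
                             env_var="GIT_LOCAL_NAMESPACE",
                             key="core.localnamespace"):
    """Return the first namespace set, from most to least specific source.

    env_var and key are hypothetical names used only for this sketch.
    """
    candidates = (
        cli_value,               # e.g. a command-line option
        environ.get(env_var),    # per-shell override
        repo_config.get(key),    # per-repository default
        global_config.get(key),  # global default such as "main"
    )
    for value in candidates:
        if value:
            return value
    return None  # no client-side namespace configured


# A global default of "main", temporarily overridden for one shell.
default = resolve_client_namespace(None, {}, {}, {"core.localnamespace": "main"})
override = resolve_client_namespace(
    None, {"GIT_LOCAL_NAMESPACE": "vendor"}, {}, {"core.localnamespace": "main"})
```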


Re: Adding more namespace support to git

2016-08-22 Thread Josh Triplett
On Mon, Aug 22, 2016 at 07:36:31PM +0100, Richard wrote:
> On 21 Aug 2016 15:07, "Josh Triplett"  wrote:
> > I'd like to see it work more automatically than that.  Perhaps a
> > separate environment variable to set the client-side namespace?
> 
> How about a config option? That could be set globally, per repository, in
> the environment or on the command line.

That might work, though you wouldn't normally want to set it globally or
per-repository (since it affects access to a repository and you'd
typically want to use multiple different values or it wouldn't have much
point).

- Josh Triplett


Re: Adding more namespace support to git

2016-08-21 Thread Josh Triplett
On Sun, Aug 21, 2016 at 12:30:16PM +0100, Richard wrote:
> On 21 August 2016 at 03:05, Josh Triplett  wrote:
> > Unfortunately, I think at this point, GIT_NAMESPACE has to exclusively
> > refer to the namespace for the remote end, to avoid breakage.  Which
> > means any automatic pervasive support for namespaces on the local side
> > would need to use a different mechanism.  (In addition to applying to
> > ref enumeration, this would also need to apply to the local end of
> > refspecs.)  And this new mechanism would need to not affect the remote
> > end, to allow remapping the local end while accessing an un-namespaced
> > (or differently namespaced) remote.
> 
> The problem for hooks is that it is implicitly inherited,
> so it could work if upload-pack, receive-pack and http-backend work
> with GIT_NAMESPACE set,
> but everything else that wants to use a namespace has to set
> --namespace on the command line.

I'd like to see it work more automatically than that.  Perhaps a
separate environment variable to set the client-side namespace?


Re: Adding more namespace support to git

2016-08-21 Thread Richard
On 21 August 2016 at 03:05, Josh Triplett  wrote:
> On Sat, Aug 20, 2016 at 08:07:00PM +0100, Richard wrote:
>> Since when upload-pack and receive-pack run hooks they leave GIT_NAMESPACE set
>> there are hook scripts that expect that the current namespace is ignored,
>> so commands that now want to be namespace aware would have to opt-in.
>
> That seems really unfortunate.  While at the time we wanted to start
> with namespace support in upload-pack and receive-pack (and
> http-backend) because those would allow using it as a server-side
> storage format, I don't think we realized that leaving GIT_NAMESPACE in
> the hook environment would completely prevent other git commands from
> automatically handling namespaces.
>
> And conversely, we can't just have upload-pack and receive-pack start
> removing it from the hook environment, because a hook might expect to
> read the current namespace from it (and then run git commands that the
> hook expects will ignore it).

This is exactly what I've had to do for my proof of concept
https://git.gitano.org.uk/cgit.git/commit/?h=richardmaw/namespaces=379124469a8a13208f976eb816375b00901ae77f

> For that matter, someone could run "GIT_NAMESPACE=foo git push
> remotename branchname" or
> "GIT_NAMESPACE=foo git clone remotename", and based on the current
> behavior, they'd expect to have the namespace apply to the remote end,
> but not the local end.

I'm fairly sure this isn't the case, at least from what I've tried.
At one point it appeared to be working,
but that was just because it started the upload-pack as a subprocess,
which inherited the GIT_NAMESPACE environment variable rather than
being passed it.
I think this is why the test suite always sets up a remote with the ext:: helper
so it can set --namespace=foo in the command.
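
The inheritance effect can be reproduced without git at all. The
following Python sketch is illustrative only: the child process is a
stand-in for a locally spawned upload-pack, and child_sees is a
hypothetical helper. It shows that a subprocess sees GIT_NAMESPACE
simply because it shares the parent's environment, and that nothing
reaches a child started with a scrubbed environment.

```python
import os
import subprocess
import sys

# Stand-in for a locally spawned server process: report what it inherited.
CHILD = "import os; print(os.environ.get('GIT_NAMESPACE', '<unset>'))"


def child_sees(env):
    """Run the stand-in child with the given environment; return its view."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# Same machine: the variable is inherited, not passed over any protocol.
parent_env = dict(os.environ, GIT_NAMESPACE="foo")

# Rough model of a real remote: the server starts from its own environment.
scrubbed_env = {k: v for k, v in parent_env.items() if k != "GIT_NAMESPACE"}

inherited = child_sees(parent_env)
remote_like = child_sees(scrubbed_env)
```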

This is one of the reasons why I have been working on namespace
support in the git server:
you have to encode the namespace in the URL somehow,
since it isn't passed through the git protocol.

We were thinking of adding ssh://git@server/~username/repo/path.git syntax
for letting users have their own private namespace in a repository,
and later extending the backend of the git server's repository storage
so that other repositories could just be namespaces of a different repository
so we could do something like repository forks
provided the repositories have the same availability.
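
A rough Python sketch of how a server might pull the namespace out of
such a URL. The function name and the rule that the ~username component
maps directly to a namespace called "username" are assumptions made for
illustration, not a description of gitano.

```python
from urllib.parse import urlsplit


def split_user_namespace(url):
    """Split a URL path into (namespace, repository path).

    Assumption: a leading "~username" path component selects a private
    namespace named "username"; any other path has no namespace.
    """
    path = urlsplit(url).path.lstrip("/")
    if not path.startswith("~"):
        return None, path
    user, _, repo_path = path[1:].partition("/")
    if not user:
        return None, path  # a bare "~" does not name anyone
    return user, repo_path


private = split_user_namespace("ssh://git@server/~username/repo/path.git")
shared = split_user_namespace("ssh://git@server/repo/path.git")
```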

> Unfortunately, I think at this point, GIT_NAMESPACE has to exclusively
> refer to the namespace for the remote end, to avoid breakage.  Which
> means any automatic pervasive support for namespaces on the local side
> would need to use a different mechanism.  (In addition to applying to
> ref enumeration, this would also need to apply to the local end of
> refspecs.)  And this new mechanism would need to not affect the remote
> end, to allow remapping the local end while accessing an un-namespaced
> (or differently namespaced) remote.

The problem for hooks is that it is implicitly inherited,
so it could work if upload-pack, receive-pack and http-backend work
with GIT_NAMESPACE set,
but everything else that wants to use a namespace has to set
--namespace on the command line.


Re: Adding more namespace support to git

2016-08-20 Thread Josh Triplett
On Sat, Aug 20, 2016 at 08:07:00PM +0100, Richard wrote:
> I work on a git server called gitano.
> We've been using and recommending cgit for the web UI.
> 
> I've been working on adding git namespace support to both,
> so that we can separate administrative branches from code branches.
> 
> Because git is not namespace aware for anything but git-upload-pack
> and git-receive-pack,
> I've had to implement namespace parsing in cgit
> for listing branches, showing logs, displaying notes and commit decorations.
> It might be more useful if this support was added to git itself,
> so other git servers could make use of it so there's less duplicated code.
> 
> I think the way to do this would be to make the low-level ref reading functions,
> read_raw_ref, for_each_reflog_ent*, reflog_exists etc.,
> interpret the ref they are passed as being relative to the current git
> namespace.
> 
> Since when upload-pack and receive-pack run hooks they leave GIT_NAMESPACE set
> there are hook scripts that expect that the current namespace is ignored,
> so commands that now want to be namespace aware would have to opt-in.

That seems really unfortunate.  While at the time we wanted to start
with namespace support in upload-pack and receive-pack (and
http-backend) because those would allow using it as a server-side
storage format, I don't think we realized that leaving GIT_NAMESPACE in
the hook environment would completely prevent other git commands from
automatically handling namespaces.

And conversely, we can't just have upload-pack and receive-pack start
removing it from the hook environment, because a hook might expect to
read the current namespace from it (and then run git commands that the
hook expects will ignore it).
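
To make that hook pattern concrete, here is a Python sketch of what
such a hook might do. It assumes the hook is handed ref names relative
to the namespace, and stored_ref_for_hook is a hypothetical helper. The
prefix layout, including the nesting rule for namespaces such as
"a/b", follows gitnamespaces(7).

```python
def stored_ref_for_hook(environ, refname):
    """Map a namespace-relative ref name to the ref actually stored.

    The hook reads GIT_NAMESPACE itself because the git commands it
    runs afterwards do not apply the namespace on its behalf.
    """
    namespace = environ.get("GIT_NAMESPACE", "")
    # gitnamespaces(7): namespace "a/b" lives under
    # refs/namespaces/a/refs/namespaces/b/.
    prefix = "".join(
        f"refs/namespaces/{part}/" for part in namespace.split("/") if part
    )
    return prefix + refname


# Arguments a hook might build for a namespace-unaware git command.
stored = stored_ref_for_hook({"GIT_NAMESPACE": "foo"}, "refs/heads/master")
command = ["git", "rev-parse", "--verify", stored]
```

If git commands started honouring GIT_NAMESPACE on their own, a hook
written this way would end up applying the prefix twice.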

(This also affects libgit2; I recently added a function to libgit2 to
interpret various git environment variables, including GIT_NAMESPACE.
If git commands can't just use that automatically, that'll need to
change too, to avoid unexpected behavior in hooks.)

For that matter, someone could run "GIT_NAMESPACE=foo git push
remotename branchname" or
"GIT_NAMESPACE=foo git clone remotename", and based on the current
behavior, they'd expect to have the namespace apply to the remote end,
but not the local end.

Unfortunately, I think at this point, GIT_NAMESPACE has to exclusively
refer to the namespace for the remote end, to avoid breakage.  Which
means any automatic pervasive support for namespaces on the local side
would need to use a different mechanism.  (In addition to applying to
ref enumeration, this would also need to apply to the local end of
refspecs.)  And this new mechanism would need to not affect the remote
end, to allow remapping the local end while accessing an un-namespaced
(or differently namespaced) remote.
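
As an illustration of remapping only the local end, this Python sketch
rewrites the destination side of a fetch refspec and leaves the remote
side untouched. namespace_local_side is a hypothetical helper, the
single-level prefix follows the gitnamespaces(7) layout, and a push
refspec would need the source side remapped instead.

```python
def namespace_local_side(refspec, namespace):
    """Prefix the local (destination) side of a fetch refspec.

    Assumption: single-level namespace. The remote (source) side is
    returned unchanged, so the remote can be un-namespaced.
    """
    force = refspec.startswith("+")
    body = refspec[1:] if force else refspec
    src, sep, dst = body.partition(":")
    if not sep or not dst:
        return refspec  # no local destination to remap
    prefix = f"refs/namespaces/{namespace}/"
    return ("+" if force else "") + src + ":" + prefix + dst


remapped = namespace_local_side("+refs/heads/*:refs/remotes/origin/*", "vendor")
```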