On 01/19/2013 01:37 AM, Junio C Hamano wrote:
> This is an early preview of reducing the network cost while talking
> with a repository with tons of refs, most of which are of use by
> very narrow audiences (e.g. refs under Gerrit's refs/changes/ are
> useful only for people who are interested in the changes under
> review).  As long as these narrow audiences have a way to learn the
> names of refs or objects pointed at by the refs out-of-band, it is
> not necessary to advertise these refs.
>
> On the server end, you tell upload-pack that some refs do not have
> to be advertised with the uploadPack.hiderefs multi-valued
> configuration variable:
>
>       [uploadPack]
>               hiderefs = refs/changes
>
> The changes necessary on the client side to allow fetching objects
> at the tip of a ref in hidden hierarchies are much more involved and
> not part of this early preview, but the end user UI is expected to
> be like these:
>
>       $ git fetch $there refs/changes/72/41672/1
>       $ git fetch $there 9598d59cdc098c5d9094d68024475e2430343182
>
> That is, you ask for a refname as usual even though it is not part
> of ls-remote response, or you ask for the commit object that is at
> the tip of whatever hidden ref you are interested in.

Although I can understand the pain of slow network performance, somehow
this proposal gives me the feeling of being expedient rather than elegant.

Could the problem be solved in some other way?  Maybe such references
could be stored in a second repository or in a separate namespace (in
the sense of gitnamespaces(7)) to prevent their creating overhead when
they are unneeded?

And *if* reference hiding makes sense, it seems to me that the client,
not the server, should be the one who decides which server references it
is interested in (though I understand that would require a protocol
change).  Otherwise the git repository *relies* on out-of-band channels
for its functionality.  If I understand correctly, a user would have *no
way* to discover, via git, what hidden references are contained in a
remote repository, or indeed even that the repo contains a hidden
namespace.  For example, this would make it impossible to clean up
obsolete "hidden" references on a remote repository without the
supplementary information stored elsewhere.  And if anybody accidentally
creates a reference in a hidden namespace by hand, it will just sit
there undetectably, forever.

I assume (though I've never checked) that a server does not let a client
ask for a SHA1 that is not currently reachable from a server-side
reference, and I assume that you are not proposing to change this
policy.  But allowing objects to be fetched from a hidden reference
opens up some "interesting" possibilities:

* A pusher could upload arbitrary content to a public git server under a
cryptic hidden reference name.  Most people would be completely unable
to see this content, unless given the SHA1 or the reference name by the
pusher.  Thus this mechanism could be used as a dark channel to exchange
arbitrary data relatively secretly.

* Somebody could push a trojan version of code to a hidden reference in
a project, then pass the SHA1 to a victim.  The victim might trust the
code because it comes from a known project website, even though the code
would be invisible to other project developers and thus impossible for
them to audit.  And even if they learned about the trojan's SHA1 they
would be unable to remove it from their repository because they have no
way to find out the name of the hidden reference!

Obviously these hacks would only be possible for a bad guy with push
privileges to a repository that has turned on hidden references, but I
think they are sobering nevertheless.

These worries would go away if reference hiding were configured on the
client rather than on the server.

A second point: currently, the outputs of "git show-ref -d" and "git
ls-remote ." are almost identical.  Under your proposal, I believe that
the hidden refs would be omitted only from the latter.  Would it be
useful to add an option to "git show-ref" to make it omit the "hiderefs"
refs?  And maybe another option to make it display *only* the "hiderefs"
refs?
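
To illustrate what such options might print, here is a rough sketch that
filters "git show-ref"-style output after the fact.  The helper names
omit_hidden and only_hidden are hypothetical (they are not existing git
options), and I am assuming that a hidden hierarchy such as refs/changes
matches whole path components, which the proposal does not spell out:

```shell
# Hypothetical helpers, NOT existing git options.  Each reads
# "git show-ref"-style lines ("<sha1> <refname>") from stdin.
# Assumption: a hidden hierarchy matches on whole path components, so
# "refs/changes" covers "refs/changes/..." but not "refs/changesets/...".

# Print only the refs outside the given hierarchy.
omit_hidden() {
    awk -v p="$1" '{ if ($2 != p && index($2, p "/") != 1) print }'
}

# Print only the refs inside the given hierarchy.
only_hidden() {
    awk -v p="$1" '{ if ($2 == p || index($2, p "/") == 1) print }'
}
```

Inside a repository this could be used as "git show-ref -d | omit_hidden
refs/changes" or "git show-ref -d | only_hidden refs/changes".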

And in the bikeshedding department, I wonder if "hiderefs" is the best
name for the config setting.  "hiderefs" implies to me that the refs
are actively hidden and not available to the client in any way.  But in
fact they are just not advertised; they can be fetched normally.  Maybe
another name would be more suggestive of its true effect, for example
"quietrefs" or "noadvertiserefs".


Michael Haggerty