Re: [PATCH v3 3/8] upload/receive-pack: allow hiding ref hierarchies

2013-02-06 Thread Jeff King
On Tue, Feb 05, 2013 at 07:45:01AM -0800, Junio C Hamano wrote:

  In the earlier review, I mentioned making this per-service, but I see
  that is not the case here. Do you have an argument against doing so?
 
 Perhaps then I misunderstood your intention.  By reminding me of the
 receive-pack side, I thought you were hinting to unify these two
 into one, which I did.  There is no argument against it.

What I meant was that there should be transfer.hiderefs, and an
individual {receive,uploadpack}.hiderefs, similar to the way we have
transfer.unpacklimit. That makes the easy case (hiding the refs
completely) easy, but leaves the flexibility for more.

Like this:

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index a8248d9..131c163 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -59,7 +59,7 @@ static int receive_pack_config(const char *var, const char 
*value, void *cb)
 
 static int receive_pack_config(const char *var, const char *value, void *cb)
 {
-   int status = parse_hide_refs_config(var, value, cb);
+   int status = parse_hide_refs_config(var, value, receive);
 
if (status)
return status;
diff --git a/refs.c b/refs.c
index e3574ca..9bfea58 100644
--- a/refs.c
+++ b/refs.c
@@ -2560,9 +2560,13 @@ int parse_hide_refs_config(const char *var, const char 
*value, void *unused)
 
 static struct string_list *hide_refs;
 
-int parse_hide_refs_config(const char *var, const char *value, void *unused)
+int parse_hide_refs_config(const char *var, const char *value, void *vsection)
 {
-   if (!strcmp(transfer.hiderefs, var)) {
+   const char *section = vsection;
+
+   if (!strcmp(transfer.hiderefs, var) ||
+   (!prefixcmp(var, section) 
+!strcmp(var + strlen(section), .hiderefs))) {
char *ref;
int len;
 
diff --git a/upload-pack.c b/upload-pack.c
index 37977e2..c0390af 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -794,7 +794,7 @@ static int upload_pack_config(const char *var, const char 
*value, void *unused)
 {
if (!strcmp(uploadpack.allowtipsha1inwant, var))
allow_tip_sha1_in_want = git_config_bool(var, value);
-   return parse_hide_refs_config(var, value, unused);
+   return parse_hide_refs_config(var, value, uploadpack);
 }
 
 int main(int argc, char **argv)


As an aside, I wonder if there is any point to the void pointer
parameter of parse_hide_refs_config. It is not used as a git_config
callback anywhere.

  And I
  have not seen complaints about the current system.
 
 Immediately after I added github to the set of places I push into,
 which I think is long before you joined GitHub, I noticed that _my_
 repository gets contaminated by second rate commits called pull
 requests, and I may even have complained, but most likely I didn't,
 as I could easily tell that, even though I know it is _not_ the only
 way, nor even the best way [*1*], to implement the GitHub's pull
 request workflow, I perfectly well understood that it would be the
 most expedient way for GitHub folks to implement this feature.
 
 I think you should take lack of complaints with a huge grain of
 salt.  It does not suggest much.

Sure, I do not pretend that nobody cares. But it is certainly not a
pressing issue, or there would probably be more outcry. And we must also
weigh it against the silent majority that are perfectly happy with the
status quo, that lets them fetch refs/pull/* as any other ref.

In your case, I really think the problem is less that you have a problem
with PR refs in the repository, and more that you do not care about the
pull request feature at all. To you it is useless noise, both in the
repo and on the web. Your arguments about provenance could apply equally
well to PRs accessible via the web interface.

I think the refs/ clutter is only an issue if you want to do mirroring,
and then you have an obvious conflict: did the fetcher want to mirror
everything, including refs/pull, or do they consider that to be clutter?
Only the client knows, which is why I think refspecs are the right place
to deal with clutter (the fact that we cannot say everything except
refs/pull/* is a weakness in our refspecs).

 *1* From the ownership point of view, objects that are only
 reachable from these refs/pull/* refs do *not* belong to the
 requestee, until the requestee chooses to accept the changes.
 
 A malicious requestor can fork your repository, add an objectionable
 blob to it, and throw a pull request at you.  GitHub shows that the
 blob now belongs to your repository, so the requestor turns around
 and file a DMCA takedown complaint against your repository.  A
 clueful judge would then agree with the complaint after running a
 clone --mirror and seeing the blob in your repository.  Oops?

I don't think this is a problem in practice. DMCA notices do not go to
the repository owner; they go to GitHub. And as far as I know, our
support staff deals with them on a case 

Re: [PATCH v3 3/8] upload/receive-pack: allow hiding ref hierarchies

2013-02-06 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Tue, Feb 05, 2013 at 07:45:01AM -0800, Junio C Hamano wrote:

  In the earlier review, I mentioned making this per-service, but I see
  that is not the case here. Do you have an argument against doing so?
 
 Perhaps then I misunderstood your intention.  By reminding me of the
 receive-pack side, I thought you were hinting to unify these two
 into one, which I did.  There is no argument against it.

 What I meant was that there should be transfer.hiderefs, and an
 individual {receive,uploadpack}.hiderefs, similar to the way we have
 transfer.unpacklimit.

Yes, as I said, I misunderstood your intention.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 3/8] upload/receive-pack: allow hiding ref hierarchies

2013-02-05 Thread Jeff King
On Wed, Jan 30, 2013 at 10:45:37AM -0800, Junio C Hamano wrote:

 Teach upload-pack and receive-pack to omit some refs from their
 initial advertisements by paying attention to the transfer.hiderefs
 multi-valued configuration variable.  Any ref that is under the
 hierarchies listed on the value of this variable is excluded from
 responses to requests made by ls-remote, fetch, clone, push,
 etc.
 
 A typical use case may be
 
   [transfer]
   hiderefs = refs/pull
 
 to hide the refs that are internally used by the hosting site and
 should not be exposed over the network.

In the earlier review, I mentioned making this per-service, but I see
that is not the case here. Do you have an argument against doing so?

I'm specifically thinking of the way we do refs/pull at GitHub (which we
hide only from receive-pack).  I know that you think it would be cleaner
to hide those, and at some level I agree. But at the same time, the
current mechanism has been in place for some time; changing what we
present via upload-pack is likely to break people's workflows. And I
have not seen complaints about the current system. So unless there is a
compelling reason to do so, I'd rather let the fetcher make the
decision.

Gerrit's refs/changes may be a different story, if they have a large
enough number of them to make upload-pack's ref advertisement
overwhelming.

I'm happy to do the per-service patch on top, but I just expected it
here, so I'm wondering if you are against having the feature.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 3/8] upload/receive-pack: allow hiding ref hierarchies

2013-02-05 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Wed, Jan 30, 2013 at 10:45:37AM -0800, Junio C Hamano wrote:

 Teach upload-pack and receive-pack to omit some refs from their
 initial advertisements by paying attention to the transfer.hiderefs
 multi-valued configuration variable.  Any ref that is under the
 hierarchies listed on the value of this variable is excluded from
 responses to requests made by ls-remote, fetch, clone, push,
 etc.
 
 A typical use case may be
 
  [transfer]
  hiderefs = refs/pull
 
 to hide the refs that are internally used by the hosting site and
 should not be exposed over the network.

 In the earlier review, I mentioned making this per-service, but I see
 that is not the case here. Do you have an argument against doing so?

Perhaps then I misunderstood your intention.  By reminding me of the
receive-pack side, I thought you were hinting to unify these two
into one, which I did.  There is no argument against it.

 And I
 have not seen complaints about the current system.

Immediately after I added github to the set of places I push into,
which I think is long before you joined GitHub, I noticed that _my_
repository gets contaminated by second rate commits called pull
requests, and I may even have complained, but most likely I didn't,
as I could easily tell that, even though I know it is _not_ the only
way, nor even the best way [*1*], to implement the GitHub's pull
request workflow, I perfectly well understood that it would be the
most expedient way for GitHub folks to implement this feature.

I think you should take lack of complaints with a huge grain of
salt.  It does not suggest much.

 Gerrit's refs/changes may be a different story, if they have a large
 enough number of them to make upload-pack's ref advertisement
 overwhelming.

This is probably a stale count, but platform/frameworks/base part of
AOSP has 3200+ refs; the corresponding repository internal to Google
has 60k+ refs (this is because there are many in-between states
recorded in the internal repository, even though the end result
published to the open source repository may be the same) and results
in ~4MB advertisement.  Which is fairly significant when all you are
interested in doing is an Am I up to date? poll.


[Footnote]

*1* From the ownership point of view, objects that are only
reachable from these refs/pull/* refs do *not* belong to the
requestee, until the requestee chooses to accept the changes.

A malicious requestor can fork your repository, add an objectionable
blob to it, and throw a pull request at you.  GitHub shows that the
blob now belongs to your repository, so the requestor turns around
and file a DMCA takedown complaint against your repository.  A
clueful judge would then agree with the complaint after running a
clone --mirror and seeing the blob in your repository.  Oops?

A funny thing is that you cannot push :refs/pull/1/head to remove
it anymore (I think in the early days, I took them out by doing this
a few times, but I may be misremembering), so you cannot make
yourself into compliance, even though you are not the offending
party.  Your repository is held responsible for whatever the rogue
requestor added.  That is not very nice, is it?

In an ideal world, I would have chosen to create a dedicated fork
managed by the hosting company (i.e. GitHub) for your repository
whose only purpose is to house these refs/pull/ refs (the hosting
site is ultimately who has to respond to DMCA notices anyway, and an
arrangement like this makes it clear who is reponsible for what).

The e-mail sent to you to let you know about outstanding pull
requests and the web UI could just point at that forked repository,
not your own (you also could choose to leave the outging pull
requests in the requestor's repository, but that is only OK if you
do not worry about (1) a requestor sending a pull request, then
updating the branch the pull request talks about later, to trick you
with bait-and-switch, or (2) a requestor sending a pull request,
thinks he is done with the topic and removes the repository).
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html