On May 5, 2009, at 6:42 AM, C. Titus Brown wrote:

> from what I can tell:
>
> github maintains a history of relationships between repositories.
>
> If I github-fork your pygr repository into a new 'pygr' repository in
> my account, github records that fork and stores only the differences
> between your repo and mine.

Are you sure about this?  Can you point me to where you got this  
explanation from github?  This sounds very un-git-like to me.  After  
all, each git repository is supposed to be a complete, functional  
repository (that's the D in DVCS), and merely refers to other copies  
of the repository by defining "remotes" for push / pull purposes.  As  
far as I know, there is no way in git to make a "delta repository"  
that merely stores the diffs vs. a full repository stored somewhere  
else.  But of course, my knowledge of git only scratches the surface  
of everything it can do...

I thought github forks work as follows:

- each repository is a normal git repository, whether it is a fork or  
not.

- a fork differs from a non-fork repository in one minor way: a local  
clone of it conventionally gets a remote called "upstream" (added by  
hand with "git remote add") that simply points to the parent  
repository.  That enables the user to get updates from the parent  
repository by just saying "git pull upstream <branchname>".  In all  
other respects, a fork repository is a full, normal git repository  
just like a non-fork repository.  If someone were to delete the parent  
repository, I don't believe everyone's fork repositories would turn  
into pumpkins.  I think the only effect would be that if a user of a  
fork repository tried to pull from the "upstream" remote, s/he'd get  
an error message saying the URL failed (because the parent repo no  
longer exists).  The user would still be able to push to origin (i.e.  
the fork repo stored on github) just fine, I think, and others would  
be able to clone / pull from it just fine too.

- outside the git repositories, github keeps some metadata about "who  
forked from who" to provide nice web features like pull-requests,  
network diagrams, etc.  As far as I know, this is just metadata, and  
quite independent of what's actually stored in the actual git  
repositories.

If you can point me to some information on github showing that this  
interpretation is wrong, please let me know...
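
To make my mental model concrete, here is a rough sketch.  It is not  
taken from any github documentation; it just uses local directories in  
place of github URLs.  The names parent, fork.git, mywork and  
someone-else are made up for illustration, and the "upstream" remote is  
one I add by hand in my working clone.

```shell
#!/bin/sh
# Sketch only: local paths stand in for github URLs; all names are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# The parent repository (stand-in for the repo being forked).
git init -q parent
cd parent
git symbolic-ref HEAD refs/heads/master   # force the branch name "master"
echo "v1" > README
git add README
git commit -q -m "initial commit"
cd "$work"

# The "fork": a complete bare clone, the way a hosting site might store it.
git clone -q --bare parent fork.git

# My working copy: origin is the fork; "upstream" points at the parent.
git clone -q fork.git mywork
cd mywork
git remote add upstream "$work/parent"
git pull -q --ff-only upstream master

# Now delete the parent and see what still works.
rm -rf "$work/parent"
if git pull -q --ff-only upstream master 2>/dev/null; then
    upstream_pull=ok
else
    upstream_pull=failed
fi

echo "v2" >> README
git commit -q -am "work continues after the parent is gone"
if git push -q origin master 2>/dev/null; then
    origin_push=ok
else
    origin_push=failed
fi

# Someone else can still clone the fork.
cd "$work"
if git clone -q fork.git someone-else 2>/dev/null; then
    fork_clone=ok
else
    fork_clone=failed
fi

echo "pull from upstream: $upstream_pull"
echo "push to origin:     $origin_push"
echo "clone of the fork:  $fork_clone"
```

This only shows how plain git behaves with full clones; it says  
nothing about how github actually stores forks on its own servers.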



>
>
> No matter what I rename my pygr repo to, github will only let me
> github-fork one copy of your pygr repo.

I think github is assuming two principles that make it unnecessary for  
one user to have multiple forks of one repo:

- git branches let you create as many experimental versions and  
branches-of-branches as you like, ad nauseam.  Why fork when you can  
branch?

- github assumes a clean separation of "collaborators" into trusted  
vs. untrusted.  Ordinary collaboration will be exclusively by forking  
and pulling, i.e. each person has their own repository that they  
control, and changes flow from one person to another only when  
the latter decides to pull someone else's changes into his personal  
repository.  That is a clean model of voluntary sharing with  
autonomy.  In special circumstances where you trust someone  
*completely*, you give them push privileges to your repository.  But  
if as you say you feel the need to keep a master repo that your  
collaborators cannot push to, then actually you *don't* trust them  
completely, and github would say that you should work with them using  
the normal fork-and-pull model.  Since github provides easy and  
convenient tools for working together in that way, that doesn't seem  
like a problem.

My impression is github does not want multiple people using one  
account; multiple forks of one repo in one account would seem to  
encourage that unwanted behavior.
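
On the "why fork when you can branch?" point, here is a minimal sketch  
of what I have in mind.  The repository is a throwaway temp directory,  
and the branch names experiment and experiment-variant are made up for  
illustration.

```shell
#!/bin/sh
# Sketch only: a throwaway repo; branch and file names are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # force the branch name "master"
echo "stable" > notes.txt
git add notes.txt
git commit -q -m "baseline"

# An experimental branch off master...
git checkout -q -b experiment
echo "idea A" >> notes.txt
git commit -q -am "try idea A"

# ...and a branch of that branch.
git checkout -q -b experiment-variant
echo "idea A, variant" >> notes.txt
git commit -q -am "try a variant of idea A"

# master is untouched by either experiment.
git checkout -q master
branch_count=$(git branch | wc -l | tr -d ' ')
master_lines=$(wc -l < notes.txt | tr -d ' ')
echo "branches: $branch_count, lines in notes.txt on master: $master_lines"
```

All three lines of development live in one repository, so a second  
fork of the same repo in the same account buys nothing extra.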

>
> How will people know that cjlee112 owns the official github repo for
> pygr?  Will they have to look at the branching diagrams or read
> something on some Web page somewhere?  (Evidence suggests that neither
> is likely to happen ;)

First, let's emphasize this issue only arises for developers using  
git, not regular users.  The vast majority of people are just going to  
download an installer package.  Only developers who want the bleeding  
edge or want to contribute, and know how to use git, are going to face  
this question.

Developers will know which repo is master in several ways:

- The Google Code source code link points to my cjlee112 github repo.

- Other github "pygr" repos will be forks of the cjlee112 pygr.git repo.

- we will tell them, consistently throughout our website materials etc.

But as you say, we can certainly maintain the repo.or.cz site as a  
replica of the master (i.e. it will mirror the cjlee112 pygr.git  
github repo).
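
Keeping such a mirror in step could look roughly like the sketch  
below.  This is an assumption about how we might do it, not a  
description of anything github or repo.or.cz provides: local  
directories stand in for both sites, and the names primary, relay.git  
and mirror.git are made up.

```shell
#!/bin/sh
# Sketch only: local paths stand in for the two hosting sites.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# The primary repo, with master plus one release branch.
git init -q primary
cd primary
git symbolic-ref HEAD refs/heads/master   # force the branch name "master"
echo "code" > module.py
git add module.py
git commit -q -m "initial commit"
git branch 0.8
cd "$work"

# The empty repository that will become the mirror.
git init -q --bare mirror.git

# A mirror clone copies every ref as-is (no remote-tracking refs),
# so pushing it with --mirror reproduces the primary's refs exactly.
git clone -q --mirror primary relay.git
cd relay.git
git push -q --mirror "$work/mirror.git"
cd "$work"

primary_refs=$(git -C primary for-each-ref --format='%(refname) %(objectname)' refs/heads)
mirror_refs=$(git -C mirror.git for-each-ref --format='%(refname) %(objectname)' refs/heads)
echo "$mirror_refs"
```

To refresh the mirror later, one would run "git remote update" in  
relay.git and then repeat the push.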

In my view the key question isn't "Is there a possibility of  
confusion?" but "Do the benefits of encouraging developers to create a  
fork and contribute changes back outweigh the potential  
disadvantages?"  Over the last year we've seen a lot of benefits of  
opening up Pygr development through the use of git and online  
collaboration tools.  That has been great.  I think github represents  
a logical next step for creating a community of developers  
contributing code back and forth.

>
>
> This is, I think, why Marek and I keep on suggesting alternatives like
> 'continue using git.or.cz for the master' or 'use a new pygr account'
> or 'name it pygr-master'.  It's a clean and fairly obvious way of
> specifying which repo is the master.

But the consequences of keeping both "pygr" and "pygr-master" repos on  
github may be less than clean and obvious.  Say I create a repo called  
pygr and another called pygr-master as you recommended.  Now it's  
"clear" that pygr-master is the master repo, whereas the pygr repo  
is... not the master?  Now other people say, "I want to fork from the  
master repo, not the lame non-master repo".  So they fork  
"pygr-master" and now we have a big group of people with "pygr-master"  
repos, and maybe another bunch of people with "pygr" repos.  When a  
new developer arrives and wants to join the fun, s/he has to figure  
out the following puzzles:

- which is the "real" pygr, "pygr" or "pygr-master"?  "I see 7 people  
with pygr repos and 9 people with pygr-master repos.  Some of them,  
like that cjlee112 guy, have both.  And the tools that work within one  
of these groups (e.g. pull requests and network diagrams) will not  
work for collaborating with the other group, because there is *no*  
connection between them.  Puzzling."

- after picking between those two groups, s/he still faces the  
question you raised -- "who should I fork from, cjlee112, ctb, istvan,  
etc.?"  No problem has been solved by this "solution"...

>
>
> Also, if our experience so far is any guide, you'll end up with a
> number of personal branches as well as "official" branches (0.7, 0.8,
> etc.) in the cjlee112 pygr repo on github, which could also be
> confusing to people looking to get the latest.

I think we're over-thinking this.  After all, this is only for  
git-savvy developers, not ordinary users.  Git universally follows the  
naming convention that "master" is the primary branch to keep in synch  
(i.e. what you called "the latest"), and anyone familiar with git will  
know that.  Your question, as I read it, is just a restatement of the  
fact that git has a learning curve because it offers so much more  
power, and that git repositories customarily get used in more  
sophisticated ways than the average CVS / SVN repo did.  I think the  
potential confusion for new developers is outweighed by the tools  
github offers for making it so darn easy for multiple people to work  
together on any branch they care to name.
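
To make the "master is the latest" point concrete, here is a small  
sketch of what a newcomer sees after cloning a repo that carries both  
official and personal branches.  Local directories stand in for github,  
and the names official, newcomer and personal-experiment are made up;  
0.8 is borrowed from the release branches mentioned above.

```shell
#!/bin/sh
# Sketch only: a local directory stands in for the github repo.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A repo with master, a release branch and a personal branch.
git init -q official
cd official
git symbolic-ref HEAD refs/heads/master   # force the branch name "master"
echo "code" > module.py
git add module.py
git commit -q -m "initial commit"
git branch 0.8
git branch personal-experiment
cd "$work"

# A plain clone checks out whatever the source repo's HEAD points at.
git clone -q official newcomer
cd newcomer
current=$(git symbolic-ref --short HEAD)

# The other branches are still visible as remote branches.
remote_branches=$(git branch -r | grep -v -- '->' | wc -l | tr -d ' ')
echo "checked out: $current, remote branches: $remote_branches"
```

So a plain clone lands on master without the newcomer having to pick  
among the other branches, which remain one "git checkout" away.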

-- Chris

