Hi Rob, Josh and Lluis, On Thu, Sep 30, 2010 at 9:41 AM, Rob Speer <[email protected]> wrote: > The fact that I wasn't around for the sprint probably has a lot to do > with how much the code had diverged. But it's not too bad -- I merged > Fernando's branch into mine and only had to change a couple of things > to make the tests pass. > > There seem to be two general patterns for decentralized projects on > GitHub: either you have one de facto leader who owns what everyone > considers the main branch (this is what datarray is doing now, with > Fernando as the leader), or you create a GitHub "organization" that > owns the main branch and make a bunch of key people members of the > organization (which is what numpy is doing). > > The way you'd usually get something merged in this kind of project is > to send a pull request to the leader using the "Pull Request" button. > But in this case, I'm basically making my pull request on the mailing > list, because it's not straightforward enough for a simple pull > request.
sorry it took a bit longer than originally planned to merge this. Some of us got together on Wednesday on campus and worked through a lot of this code (we closed one pull request by Lluis and replied to the other requesting more work, as it broke many tests). We didn't announce an IRC sprint because last time it just proved a bit hard to be simultaneously productive on-premises and keep a good flow of activity on IRC for remote contributors, sorry. Perhaps with more people available that will be possible later on, but for now it seemed wiser for us just to push through with what little work we could do. **Merged changes** Here's a summary of the things we did merge in from your pull request: - merged gitignore, thanks. - your graft change to manifest.in wasn't needed, instead the bug was that our setup file was missing properly listing the testing package, and that's the right solution to apply: PACKAGES = ["datarray", "datarray/tests", "datarray/testing"] Thanks for catching that problem! - merge readme improvements into readme.txt. Note that since this is a pure python project, markdown-formatted readme files are fairly out of place (they don't render correctly on pypi, for one thing, and our readme is the same as our long_description field in the setup.py). So the top-level readme must remain a reST file. But we did incorporate your text, thanks! As a future note when editing text files, please keep text lines to 80 characters just like code ones. Diff (and github) are mostly line-oriented, so it's best to format text files with hard linebreaks (even if many editors can handle soft linewraps correctly, it's just not very portable). - merged your fancy printing support, excellent work! We did adjust the tests a little bit so they would pass without named/attribute support, since that will require more discussion (see below). But the main file is in unmodified, and the test changes are tiny. We made a mini-release v0.0.5 now, with these changes in: http://fperez.github.com/datarray-doc/0.0.5/index.html **Workflow** For future reference, while we were able to manually work through your pull request, I'd like to suggest that you adopt a more traditional workflow where a single pull request contains only a "conceptually atomic" set of changes related to each other. That way it can either be all merged or discussed and refined until merge more easily. Your pull request had work from multiple people, making changes of many unrelated types (gh-pages, .gitignore, named access, printing, etc...). I was able to cherry-pick one or two commits, but by and large I had to resort to manually copying files out of your repo, because there were some things that should not be merged, and there was no easy way to disentangle it. Here's for example the currently active pull requests on ipython: http://github.com/ipython/ipython/pulls some of them are fairly extensive, but each is conceptually atomic, so each can be studied in isolation and either will require refinement or will be merged, but nobody is going to have to dissect it into pieces to commit some and drop others. I hope this is clear, let me know if you need any more info on this. Ultimately it's a matter of making the process more efficient for all involved. **Changes where further discussion is needed** There were some things we did *not* merge. The sphinx extension isn't needed (we use a different mechanism for gh-pages that is cleaner), so we just ignored it. But the important point are the changes to named access support. I realize since the conference you've wanted this, but we really would like to proceed more carefully and implement first, only .axis.name access. The top-level access requires a changed __getattr__ method (which slows down *all* attribute access), and opens the door for name collisions. I think the best approach will be to follow the lead of numpy here: structured arrays offer only access by named key ['name'], and for plain .name access you need to make them a recarray. We should also offer in our base class only the simpler, safer mechanism, and then we can build one that uses the .name attributes as a subclass for those who want the convenience and understand the risks and tradeoffs. How does this sound to you? We should make sure we agree on the api before writing too much more code along these lines. I realize you've actually contributed an implementation of this, but I think we need to make sure it's the right design before merging it in. We do need to have this design discussion to find a class that will work for as many people as possible, so many thanks for starting by offering working code! Regards, f _______________________________________________ NumPy-Discussion mailing list [email protected] http://mail.scipy.org/mailman/listinfo/numpy-discussion
