Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
On Tue, Dec 29, 2009 at 10:27 PM, René Dudfield wrote:
> Hi,
>
> In the toydist proposal/release notes, I would address 'what does
> toydist do better' more explicitly.
>
> A big problem for science users is that numpy does not work with
> pypi + (easy_install, buildout or pip) and python 2.6.
>
> Working with the rest of the python community as much as possible is
> likely a good goal.

Yes, but it is hopeless. Most of what is being discussed on
distutils-sig is useless for us, and what matters is ignored at best.
I think most people on distutils-sig are misguided, and I don't think
the community is representative of people concerned with packaging
anyway - most of the participants seem to come from web development,
and are mostly dismissive of others' concerns (OS packagers, etc...).

I want to note that I am not starting this out of thin air - I know
most of the distutils code very well, and I have been more or less the
sole maintainer of numpy.distutils for 2 years now. I have written
extensive distutils extensions, in particular numscons, which is able
to fully build numpy, scipy and matplotlib on every platform that
matters.

Simply put, distutils code is horrible (this is an objective fact) and
flawed beyond repair (this is more controversial). IMHO, it has almost
no useful features, except being standard.

If you want a more detailed explanation of why I think distutils and
all tools on top are deeply flawed, you can look here:

http://cournape.wordpress.com/2009/04/01/python-packaging-a-few-observations-cabal-for-a-solution/

> numpy used to work with buildout in python2.5, but not with 2.6.
> buildout lets other team members get up to speed with a project by
> running one command. It installs things in the local directory, not
> system wide. So you can have different dependencies per project.

I don't think it is a very useful feature, honestly.
It seems to me that they created a huge infrastructure to split
packages into tiny pieces, and then try to get them back together,
imagining that multiple installed versions are a replacement for
backward compatibility. Anyone with extensive packaging experience
knows that's a deeply flawed model in general.

> Plenty of good work is going on with python packaging.

That's the opposite of my experience. What I care about is:
 - tools which are hackable and easily extensible
 - robust install/uninstall
 - a real, DAG-based build system
 - explicitness and repeatability

None of this is supported by the tools, and the current directions go
even further away. When I have to explain at length why the
command-based design of distutils is a nightmare to work with, I don't
feel very confident that the current maintainers are aware of the
issues, for example. It shows that they never had to extend distutils
much.

> There are build farms for windows packages and OSX uploaded to pypi.
> Start uploading pre releases to pypi, and you get these for free (once
> you make numpy compile out of the box on those compile farms). There
> are compile farms for other OSes too... like ubuntu/debian, macports
> etc. Some distributions even automatically download, compile and
> package new releases once they spot a new file on your ftp/web site.

I am familiar with some of those systems (PPA and opensuse build
service in particular). One of the goals of my proposal is to make it
easier to interoperate with those tools.

I think Pypi is mostly useless. The lack of enforced metadata is a big
no-no IMHO. The fact that Pypi is miles behind CRAN, for example, is
quite significant. I want CRAN for scientific python, and I don't see
Pypi becoming it in the near future.
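As a rough illustration of what "enforced metadata" could mean for a package index, the sketch below rejects an upload whose metadata is incomplete. The list of required fields and the function names are assumptions made for this example only; they are not taken from Pypi, CRAN or toydist.

```python
# Hypothetical set of fields an index might insist on before accepting
# a package. The exact list is an assumption for illustration.
REQUIRED_FIELDS = ("name", "version", "summary", "license", "author")


def missing_metadata(metadata):
    """Return the required fields that are absent or blank."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(metadata.get(field) or "").strip()
    ]


def accept_upload(metadata):
    """Accept the upload only if every required field is filled in."""
    missing = missing_metadata(metadata)
    if missing:
        raise ValueError(
            "upload rejected, missing metadata: " + ", ".join(missing)
        )
    return True
```

The point of checking at upload time is that every package on the server can then be relied upon to carry the same minimal description, instead of leaving it to each author's goodwill.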
The point of having our own Pypi-like server is that we could do the
following:
 - enforce metadata
 - make it easy to extend the service to support our needs

> pypm: http://pypm.activestate.com/list-n.html#numpy

It is interesting to note that one of the maintainers of pypm recently
quit the discussion about Pypi, most likely out of frustration with
the other participants.

> Documentation projects are being worked on to document, give tutorials
> and make python packaging be easier all round. As witnessed by 20 or
> so releases on pypi every day(and growing), lots of people are using
> the python packaging tools successfully.

This does not mean much IMO. Uploading to Pypi is almost required to
use virtualenv, buildout, etc.. The interesting metric is not how many
packages are uploaded, but how much they are used by people other than
developers.

> I'm not sure making a separate build tool is a good idea. I think
> going with the rest of the python community, and improving the tools
> there is a better idea.

It has been tried, and IMHO has been proved to have failed. You can
look at the recent discussion (the one started by Guido in
particular).

> pps. some notes on toydist itself.
> - toydist convert is cool for people converting a setup.py . This
> means that most people can try out toydist right away. but
Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
On Tue, Dec 29, 2009 at 10:27 PM, René Dudfield wrote:
> Buildout is what a lot of the python community are using now.

I would like to note that buildout is a solution to a problem that I
don't care to solve. This issue is particularly difficult to explain
to people accustomed to buildout in my experience - I have not found a
way to explain it very well yet.

Buildout and virtualenv both work by sandboxing from the system
python: the sandboxes do not see each other, which may be useful for
development, but as a deployment solution for the casual user who may
not be familiar with python, it is useless. A scientist who installs
numpy, scipy, etc... to try things out wants to have everything
available in one python interpreter, and does not want to jump between
different virtualenvs and whatnot to try different packages.

This has strong consequences on how you look at things from a
packaging POV:
 - uninstall is crucial
 - a package bringing down python is a big no no (this happens way too
   often when you install things through setuptools)
 - if something fails, the recovery should be trivial
 - the person doing the installation may not know much about python
 - you cannot use sandboxing as a replacement for backward
   compatibility (that's why I don't care much about all the
   discussion about versioning - I don't think it is very useful as
   long as python itself does not support it natively).

In the context of ruby, this article makes a similar point:

http://www.madstop.com/ruby/ruby_has_a_distribution_problem.html

David

--
This SF.Net email is sponsored by the Verizon Developer Community
Take advantage of Verizon's best-in-class app development support
A streamlined, 14 day to market process makes app distribution fast and easy
Join now and get one step closer to millions of Verizon customers
http://p.sf.net/sfu/verizon-dev2dev
___
Matplotlib-devel mailing list
Matplotlib-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-devel
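The "uninstall is crucial" and "recovery should be trivial" requirements in David's message above can be pictured with a manifest-based installer: every file written is recorded, so removal does not have to guess. This is only a minimal sketch under that assumption; the function names and the JSON manifest format are hypothetical and do not describe how toydist, distutils or setuptools behave.

```python
import json
import os
import shutil


def install(src_dir, prefix, manifest_path):
    """Copy every file under src_dir into prefix, recording each destination."""
    installed = []
    try:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                src = os.path.join(root, name)
                dst = os.path.join(prefix, os.path.relpath(src, src_dir))
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                # Record the path before copying, so that a copy which
                # fails halfway is still tracked.
                installed.append(dst)
                shutil.copy2(src, dst)
    finally:
        # The manifest is written even when a copy fails, so a partial
        # install can still be undone.
        with open(manifest_path, "w") as f:
            json.dump(installed, f, indent=2)
    return installed


def uninstall(manifest_path):
    """Remove exactly the files listed in the manifest, then the manifest."""
    with open(manifest_path) as f:
        installed = json.load(f)
    for path in installed:
        if os.path.exists(path):
            os.remove(path)
    # Directories created during install are left in place in this sketch.
    os.remove(manifest_path)
```

Because the manifest is the only source of truth, the person running the uninstall does not need to know anything about python or about where the package put its files.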
Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
On Tue, Dec 29, 2009 at 11:34:44PM +0900, David Cournapeau wrote:
> Buildout, virtualenv all work by sandboxing from the system python:
> each of them do not see each other, which may be useful for
> development, but as a deployment solution to the casual user who may
> not be familiar with python, it is useless. A scientist who installs
> numpy, scipy, etc... to try things out want to have everything
> available in one python interpreter, and does not want to jump to
> different virtualenvs and whatnot to try different packages.

I think that you are pointing out a large source of misunderstanding
in packaging discussions. People behind setuptools, pip or buildout
care about having a working ensemble of packages that delivers an
application (often a web application)[1]. You and I, and many
scientific developers, see libraries as building blocks that need to
be assembled by the user, the scientist using them to do new science.
Thus the idea of isolation is not something that we can accept,
because it means that we are restricting the user to a set of
libraries.

Our definition of user is not the same as the user targeted by
buildout. Our user does not push buttons; he writes code. However,
unlike the developer targeted by buildout and distutils, our user does
not want or need to learn about packaging.

Trying to make the debate clearer...

Gaël

[1] I know your position on why simply focusing on sandboxing a
working ensemble of libraries is not a replacement for backward
compatibility, and will only create impossible problems in the long
run. While I agree with you, this is not my point here.
Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
David Cournapeau wrote:
> Buildout, virtualenv all work by sandboxing from the system python:
> each of them do not see each other, which may be useful for
> development,

And certain kinds of deployment, like web servers or installed tools.

> but as a deployment solution to the casual user who may
> not be familiar with python, it is useless. A scientist who installs
> numpy, scipy, etc... to try things out want to have everything
> available in one python interpreter, and does not want to jump to
> different virtualenvs and whatnot to try different packages.

Absolutely true -- which is why Python desperately needs package
version selection of some sort. I've been tooting this horn on and off
for years but never got any interest at all from the core python
developers.

I see putting packages in with no version as like having non-versioned
dynamic libraries in a system -- i.e. dll hell. If I have a bunch of
stuff running just fine with the various package versions I've
installed, but then I start working on something (maybe just testing,
maybe something more real) that requires the latest version of a
package, I have a few choices:

 - install the new package and hope I don't break too much

 - use something like virtualenv, which requires a lot of overhead to
   set up and use (my evidence is personal: despite working with a
   team that uses it, somehow I've never gotten around to using it for
   my dev work, even though, in theory, it should be a good solution)

 - setuptools does supposedly support multiple version installs and
   selection, but it's ugly and poorly documented enough that I've
   never figured out how to use it.

This has been addressed with a handful of ad hoc solutions: wxPython
has wxversion.select, and I think PyGTK has something, and who knows
what else. It would be really nice to have a standard solution
available.

Note that the usual response I've gotten is to use py2exe or something
to distribute, so you're defining the whole stack.
That's good for some things, but not all (though py2app's "alias"
bundles are nice), and really pretty worthless for development. Also,
many, many packages are a pain to use with py2exe and friends anyway
(see my forthcoming other long post...)

> - you cannot use sandboxing as a replacement for backward
> compatibility (that's why I don't care much about all the discussion
> about versioning - I don't think it is very useful as long as python
> itself does not support it natively).

Could be -- I'd love to have Python support it natively, though
wxversion isn't too bad.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov
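To make the idea of "package version selection" in Chris's message a little more concrete, here is a minimal sketch of choosing among several installed versions of one package. It is not how wxversion or setuptools are implemented; the function names and the dotted-integer version format are assumptions made for this example.

```python
def parse(version):
    """Turn a dotted version string such as '2.8.10' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def select(requested, installed):
    """Pick the newest installed version whose leading parts match `requested`.

    `requested` may be a prefix such as '2.8', which matches 2.8.x.
    """
    want = parse(requested)
    matches = [v for v in installed if parse(v)[: len(want)] == want]
    if not matches:
        raise LookupError("no installed version matches " + requested)
    # Compare numerically, so that 2.8.10 is newer than 2.8.9.
    return max(matches, key=parse)
```

For example, with versions 2.6.4, 2.8.9, 2.8.10 and 2.9.1 installed side by side, a program asking for "2.8" would get 2.8.10 while a program asking for "2.6" would keep working against 2.6.4.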
Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
On Wed, Dec 30, 2009 at 3:36 AM, René Dudfield wrote:
> On Tue, Dec 29, 2009 at 2:34 PM, David Cournapeau wrote:
>> On Tue, Dec 29, 2009 at 10:27 PM, René Dudfield wrote:
>>
>>> Buildout is what a lot of the python community are using now.
>>
>> I would like to note that buildout is a solution to a problem that I
>> don't care to solve. This issue is particularly difficult to explain
>> to people accustomed with buildout in my experience - I have not found
>> a way to explain it very well yet.
>
> Hello,
>
> The main problem buildout solves is getting developers up to speed
> very quickly on a project. They should be able to call one command
> and get dozens of packages, and everything else needed ready to go,
> completely isolated from the rest of the system.
>
> If a project does not want to upgrade to the latest versions of
> packages, they do not have to. This reduces the dependency problem a
> lot. As one package does not have to block on waiting for 20 other
> packages. It makes iterating packages daily, or even hourly to not be
> a problem - even with dozens of different packages used. This is not
> theoretical, many projects iterate this quickly, and do not have
> problems.
>
> Backwards compatibility is of course a great thing to keep up... but
> harder to do with dozens of packages, some of which are third party
> ones. For example, some people are running pygame applications
> written 8 years ago that are still running today on the latest
> versions of pygame. I don't think people in the python world
> understand API, and ABI compatibility as much as those in the C world.
>
> However buildout is a solution to their problem, and allows them to
> iterate quickly with many participants, on many different projects.
> Many of these people work on maybe 20-100 different projects at once,
> and some machines may be running that many applications at once too.
> So using the system pythons packages is completely out of the question
> for them.
This is all great, but I don't care about solving this issue; this is
a *developer* issue. I don't mean this is not an important issue, it
is just totally out of scope.

The developer issues I care about are much more fine-grained (correct
dependency handling between targets, toolchain customization, etc...).
Note however that hopefully, by simplifying the packaging tools, the
problems you see with numpy on 2.6 would be less common. The whole
distutils/setuptools/distribute stack is hopelessly intractable, given
how messy the code is.

> It is very easy to include a dozen packages in a buildout, so that you
> have all the packages required.

I think there is a confusion - I mostly care about *end users*. People
who may not have compilers, who want to be able to easily upgrade one
package, etc...

David
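"Correct dependency handling between targets" is what a DAG-based build system provides: each target declares what it depends on, and the build order is derived from that graph instead of from a fixed sequence of commands. The sketch below shows only the ordering step, using the standard library's graphlib module (Python 3.9 and later). The target names are hypothetical, and nothing here describes how numscons or toydist are implemented.

```python
from graphlib import TopologicalSorter

# Hypothetical targets: each key is built from the set of targets it
# depends on. Source files have no dependencies of their own.
DEPENDS_ON = {
    "core.o": {"core.c"},
    "linalg.o": {"linalg.c"},
    "_ext.so": {"core.o", "linalg.o"},
}


def build_order(depends_on):
    """Return the targets ordered so that dependencies always come first.

    graphlib raises CycleError if the graph contains a dependency cycle.
    """
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for target in build_order(DEPENDS_ON):
        print("build", target)
```

The relative order of independent targets (here the two object files) is not fixed by the graph, which is also what lets such a system build them in parallel.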