Hi,

Apologies in advance for the long post. It boils down to this: is there any interest from the sage community in participating in the development of a python distribution for large-scale distributed-memory parallel machines?

I'm posting this on behalf of (but not representing) a group of government scientists who are trying to work toward a common python distribution on the government systems we use. The reasons we're doing this are:

1) we can't trust the system python on many HPC systems, if it even exists;
2) because of 1), almost all of us spend too much time building and maintaining our own python "stack" based on some mixture of make, cmake, autoconf, and/or the sage spkg system; and
3) our community suffers from the fact that we can't always share python modules and scripts on these systems because we're not working from equivalent python environments.

Here's what I think we need:

1) A standard, which specifies a python version and a list of python packages and their dependent packages. This allows for-profit vendors to build to our standard.
2) A build system that allows extensive configuration of the entire system but with enough granularity that the format of a package is standardized and relatively straightforward. On the other hand, the whole system must be designed such that it can be built repeatedly from scratch without any interactive steps.
3) A testing system that is simple enough that the community can easily contribute tests to ensure that the community python is reliable for their needs.
4) A framework for making this environment extensible without requiring forking it and creating yet more distributions.

Here's a straw man:

1) Standard: Python 2.7.2 PLUS:

   - numpy *
   - scipy
   - matplotlib *
   - vtk (python wrappers + C++ libs) *
   - elementtree *
   - ctypes *
   - readline (i.e. a functional readline extension module) *
   - swig
   - mpi4py *
   - petsc4py *
   - pympi
   - nose *
   - pytables *
   - basemap
   - cython *
   - sympy *
   - pycuda
   - pyopencl
   - IPython *
   - wxpython
   - PyQt *
   - pygtk
   - PyTrilinos
   - virtualenv *
   - Pandas
   - numexpr *
   - pygrib

   Note: *Our group has these in the python stack we build for our PDE solver framework (http://proteus.usace.army.mil), which we build on a range of machines at 4 major supercomputing centers.

   The main issue I see with 1) is that this is somewhat different from the sage package list. We would need many optional sage packages but wouldn't need some of the standard sage packages.

2) Build System:

   a. Use cmake* for the top level configuration, storing the part relevant for each package in a subdirectory for each package (call it package_name_Config, e.g. numpyConfig, petsc4pyConfig, ...)
   b.
store each package as an spkg** that meets sage community standards, except that spkg-install will rely on information from package_name_Config (maybe it would be OK to edit files in package_name_Config located INSIDE package_name_version.spkg during the interactive configuration step?)
   c. each package will still get built with its native build system***

   Notes:
   *Our group simply uses make instead of cmake, with a top level Makefile containing 'editConfig' and 'newConfig' targets that allow you to edit and copy existing configurations.
   **Our group only produces a top level spkg, but I think we could easily generate a finer-grained set of spkg's for ones that don't already exist.
   ***Our group does this (i.e. we don't rewrite upstream build systems). I think spkg's also use the native build system in most cases, right?

   The main issue with 2) (the build system) is that building on HPC systems requires extensive configuration of individual packages: numpy needs to get built with the right vendor blas/lapack and potentially the correct, non-gcc, optimizing compilers (maybe even a mixture of gcc and some vendor fortran). Likewise, petsc4py might need to use PETSc libraries installed as part of the HPC baseline configuration rather than building the source included with this distribution. My impression is that sage very reasonably opted to focus on the web notebook and a gnu-based linux environment, so the spkg system alone doesn't fully meet the needs of the HPC community. We need the ability to specify different compilers for different packages and to do a range of things, from building every dependency to building only python wrappers for many dependencies.

3) buildbot + nose and a package_nameTest directory for community supplied tests of each package in addition to the packages' own tests.
This way users only have to add test_NAME.py files to

4) virtualenv + pip should allow users to extend the python installation into their private environment, where they can update and add new packages as necessary. An issue here is that it wouldn't allow a per-user sage environment, so I'm not sure whether users could also install spkg's or even use their modified python environment from sage.

Anyway, I'd be grateful for any input, regardless of whether this project seems like a good fit for more formal participation from the sage community.

Thanks,
Chris