Hi Benda,

I agree regarding space; it's cheap, so a few GB doesn't matter. The main reason for wanting to keep it small, with most libraries used from the host, is that importing things would be faster. Our NFS storage is sometimes very slow when accessing many small files, so it would be nice to keep the size down, but it's not the most important issue.
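
To put a number on the import overhead, something like the following could be used. This is only a rough sketch: time_import is a name I made up for illustration, and "json" is a stand-in for whatever heavy package (numpy, scipy, ...) is actually of interest. It only measures the time seen from within one Python process, so results on NFS will depend on whether the files are already in the client cache.

```python
import importlib
import sys
import time


def time_import(module_name):
    """Return the seconds taken to import module_name afresh (hypothetical helper)."""
    # Drop the module and its submodules so the import is not
    # simply answered from the sys.modules cache.
    for name in list(sys.modules):
        if name == module_name or name.startswith(module_name + "."):
            del sys.modules[name]
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # "json" is a placeholder; substitute the package to be measured.
    print("import took %.3f s" % time_import("json"))
```

Running it once from the prefix interpreter and once from a local-disk Python would give a first impression of how much the many small files on NFS actually cost.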

Another problem I had was getting X11 applications in the prefix to work. I assume this is because the library versions differ from those of the running X11 server. Does this work for you?

Best,
Martin

On Sat, Dec 21, 2013 at 9:16 PM, <[email protected]> wrote:
> Dear Martin,
>
> Martin Luessi <[email protected]> writes:
>
>> First, let me explain the reason for why anyone would want to do so.
>> For work, I use Python extensively for scientific computing. However,
>> I do not have administrator rights on my workstation and the
>> distribution we use (CentOS) does not have the latest Python packages
>> that are needed for scientific computing. In addition, even if CentOS
>> had the packages, it wouldn't be feasible to constantly ask the
>> sysadmins to install/update packages. One solution is to use a
>> scientific Python distribution from a commercial vendor, e.g., Canopy
>> from Enthought or Anaconda from Continuum Analytics. While these
>> distributions work quite well, they are expensive for non-academic
>> users and they are not very flexible, i.e., it can be difficult to
>> install packages that are not in the package repository provided by
>> the vendor, especially if the packages need additional dependencies. I
>> also have a gentoo-prefix setup on my workstation.
>
> Me too, I use Gentoo Prefix for Python-centered scientific computing on
> the cluster of my institute.
>
>> However, the whole prefix directory is very large as it makes minimal
>> assumptions about the libraries provided by the host system. The size
>> is a problem when using it over NFS e.g. on a cluster. Also, I have
>> found that it is difficult to get X11 applications working as the
>> gentoo-prefix will install its own X server etc.
>>
>> This made me wonder whether portage could be used to build a
>> scientific Python installation. My idea is instead of making very
>> minimal assumptions about the libraries provided by the host system
>> (as done in a normal prefix install), one could generate a world file
>> listing all the libraries provided by the host system and freeze their
>> versions using package-mask. Like that, programs and libraries in the
>> prefix would link to libraries on the host system whenever possible,
>> which would make the prefix smaller. By having a gentoo based
>> scientific Python installation, one could take advantage of all the
>> packages provided by gentoo-science and it would make it easy to
>> install Python packages that depend on non-Python libraries.
>> What do you guys think, is this feasible?
>
> Let me try to argue against it.
>
> 1. The disk space is extremely cheap now, $1/GB. Prefix will occupy at
>    most 5GB, with an average of 2GB and minimal of less than 1GB.
>
> 1a. NFS is not cool to throw the build directory onto.
>     What I do is to set PORTAGE_TMPDIR="/dev/shm" or whatever
>     tmpfs. Then you can achieve a modest speed of building.
>
> 2. We are actually doing the other way round: Isolate from the host
>    libraries as much as possible. We have even reached a (experimental)
>    stage where only the kernel of the host is used[a].
>
>    Why? Because trying to be compatible with a large range of versions
>    of libraries is not possible. Even the kernel version could break
>    something[b], and even the present Prefix get broken by some
>    unexpectedly behaved host libraries. Redhat build their product on
>    ancient software for a reason: stability.
>
> My thought is to ignore the space Prefix occupies and focus on the
> features, stability/maintainability instead.
>
> Benda
>
> a. http://blogs.gentoo.org/news/2013/11/01/gentoo-monthly-newsletter-31-october-2013/#RAP
> b. https://bugs.gentoo.org/show_bug.cgi?id=493074
>
