Re: [Numpy-discussion] Possible to use numexpr with user made ufuncs/scipy ufuncs?
On Saturday 26 June 2010 19:17:43, Pauli Virtanen wrote:

But what if such an expression does not exist? For example, there is no finite closed-form expression for the Bessel functions in terms of elementary functions. An accurate implementation is rather complicated, and usually piecewise defined. For convenience, it could be useful if one could do something like

    numexpr.evaluate("cos(iv(0, x))", functions=dict(iv=scipy.special.iv))

and this would be translated to numexpr bytecode that would make a Python function call to obtain iv(0, x) for each block of data required, assuming iv is a vectorized function. It's of course possible to precompute the value of iv(0, x), but this is extra hassle and requires additional memory.

Yeah, I can see the merit of implementing such a thing, mainly for avoiding additional memory consumption. But again, the nice thing would be to implement such special functions in terms of numexpr expressions, so that the evaluation itself can be faster. Admittedly, that would take a bit more time. Anyway, if someone comes up with patches for implementing this, I'd be glad to commit them.

-- Francesc Alted

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
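For illustration, the precomputation workaround Pauli mentions can be sketched with NumPy alone: numpy.i0 computes the modified Bessel function of order zero, i.e. iv(0, x), so neither SciPy nor numexpr is needed for this sketch. The array size is arbitrary.

```python
import numpy as np

x = np.linspace(0.0, 2.0, 1_000_000)
iv0 = np.i0(x)        # precomputed Bessel term: one extra temporary array
result = np.cos(iv0)  # the rest of the expression
```

The extra temporary `iv0` is exactly the additional memory cost the proposal would avoid by evaluating the special function block by block inside the numexpr virtual machine.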
Re: [Numpy-discussion] Possible to use numexpr with user made ufuncs/scipy ufuncs?
On Monday 2010-06-28 at 09:48 +0200, Francesc Alted wrote: [clip]

But again, the nice thing would be to implement such special functions in terms of numexpr expressions, so that the evaluation itself can be faster. Admittedly, that would take a bit more time.

Quite often, you need to evaluate a series or continued-fraction expansion to get the value of a special function at some point, limiting the number of terms by stopping when a certain convergence criterion is satisfied. Also, which series to sum typically depends on the point where things are evaluated. These things don't vectorize very nicely. If I understand correctly, the numexpr bytecode interpreter does not support conditionals and loops like this at the moment?

The special function implementations are typically written in bare C/Fortran. Do you think numexpr could give speedups there? As I see it, speedups (at least with MKL) could come from using faster implementations of the sin/cos/exp etc. basic functions. Using SIMD to maximum effect would then require an amount of cleverness in re-writing the evaluation algorithms, which sounds like a major amount of work.

-- Pauli Virtanen
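To make the point concrete, here is an illustrative (not production-quality) power-series evaluation of the modified Bessel function I0, the kind of loop Pauli describes. In a vectorized version the data-dependent stopping rule degrades: every point keeps iterating until the slowest-converging point is done.

```python
import numpy as np

def i0_series(x, tol=1e-12, max_terms=200):
    """I0(x) = sum_k (x^2/4)^k / (k!)^2, summed until terms are
    negligible -- a sketch of a convergence-criterion loop."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    term = np.ones_like(x)               # k = 0 term
    for k in range(1, max_terms):
        total += term
        term = term * (x / (2.0 * k)) ** 2
        # vectorized stopping rule: ALL points must have converged,
        # so fast-converging points do wasted work
        if np.all(term < tol * np.abs(total)):
            break
    return total
```

Comparing against numpy.i0 shows the series converges correctly for small arguments; for large x a different expansion would be needed, which is exactly the piecewise structure that resists expression as a single numexpr formula.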
Re: [Numpy-discussion] Is numpy ignoring CFLAGS?
Dr. David Kirkby wrote:

On some 64-bit platforms, which include, but are not limited to:

* Some versions of OS X (I don't know what versions or processors)
* Solaris on SPARC processors.
* Solaris on x86 processors.
* OpenSolaris on SPARC processors.
* OpenSolaris on x86 processors.
* HP-UX on PA-RISC processors.

the default is to build 32-bit objects, but 64-bit objects can be created if needed. This is usually done by adding the -m64 flag when compiling with GCC or Sun Studio, though the flag will be different with HP's compiler. Numpy is used as part of Sage, but it would appear that adding -m64 to CFLAGS will not work. A comment in the script used to build numpy shows:

First: Is Python built using -m64? If not, is there any reason to expect that building NumPy 64-bit and loading it into a 32-bit Python would work? If Python is built with -m64, I'd expect NumPy to pick it up automatically, as it queries Python for the build flags to use...

    # numpy's distutils is buggy and runs a conftest without
    # taking CFLAGS into account. With 64 bit OSX this results
    # in *boom*

it then goes on to copy a file called gcc_fake, which is basically a script which gets renamed to gcc, but includes the -m64 flag. We are using numpy-1.3.0. Is this a known bug? If not, can I submit it to a bug database? Better still, does anyone have a patch to resolve it - I hate the idea of making

Until somebody who really knows an answer chimes in: AFAIK this is a feature of distutils itself, so it affects most Python software. (Setting CFLAGS overwrites the necessary CFLAGS settings, like -fPIC and -fno-strict-aliasing, that are queried from Python.) Try setting OPT instead?

Dag Sverre
Re: [Numpy-discussion] Possible to use numexpr with user made ufuncs/scipy ufuncs?
On Monday 28 June 2010 10:22:31, Pauli Virtanen wrote: [clip]

Quite often, you need to evaluate a series or continued-fraction expansion to get the value of a special function at some point, limiting the number of terms by stopping when a certain convergence criterion is satisfied. Also, which series to sum typically depends on the point where things are evaluated. These things don't vectorize very nicely. If I understand correctly, the numexpr bytecode interpreter does not support conditionals and loops like this at the moment?

No, it does not support loops. And conditionals are supported only via vectorized conditionals (i.e. the `where()` opcode).

The special function implementations are typically written in bare C/Fortran. Do you think numexpr could give speedups there? As I see it, speedups (at least with MKL) could come from using faster implementations of the sin/cos/exp etc. basic functions. Using SIMD to maximum effect would then require an amount of cleverness in re-writing the evaluation algorithms, which sounds like a major amount of work.

Okay. I thought that special functions were more 'vectorizable', or that they could be expressed in terms of more basic functions (sin/cos/exp). But if this is not the case, then it is quite clear that the best solution would be your first suggestion (i.e. implement user-provided ufunc evaluation). And definitely, implementing special functions in terms of SIMD would really be a *major* effort, and only doable by very specialized people ;-)

-- Francesc Alted
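For readers unfamiliar with numexpr's `where()` opcode: it performs the same elementwise selection as numpy.where, and, like it, both branches are evaluated for every element before merging. A NumPy-only sketch of the semantics (numexpr itself is not required here):

```python
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
# numexpr's where(cond, a, b) corresponds to this elementwise selection;
# it cannot express early exit or per-element iteration counts.
y = np.where(x > 0, x * x, -x)
```

This is why a data-dependent convergence loop cannot be written as a numexpr expression: the vectorized conditional selects between fully evaluated branches rather than controlling flow.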
Re: [Numpy-discussion] Is numpy ignoring CFLAGS?
On 06/28/10 09:38 AM, Dag Sverre Seljebotn wrote: [clip]

First: Is Python built using -m64? If not, is there any reason to expect that building NumPy 64-bit and loading it into a 32-bit Python would work? If Python is built with -m64, I'd expect NumPy to pick it up automatically, as it queries Python for the build flags to use...

Yes, Python is built 64-bit, using the -m64 option.

[clip]

Until somebody who really knows an answer chimes in: AFAIK this is a feature of distutils itself, so it affects most Python software. (Setting CFLAGS overwrites the necessary CFLAGS settings, like -fPIC and -fno-strict-aliasing, that are queried from Python.) Try setting OPT instead?

OPT has -m64 in it. This is the bit that shows how Python is built on Solaris (uname=SunOS). SAGE64 will be set to yes for a 64-bit build.

    OPT="-g -O3 -m64 -Wall -Wstrict-prototypes"; export OPT
    ./configure $EXTRAFLAGS --prefix="$SAGE_LOCAL" \
        --enable-unicode=ucs4 --with-gcc="gcc -m64"

Many other parts of Sage seem to inherit the flags OK from Python, but not numpy. It is not a Solaris-specific issue, as the same problem occurs on OS X.

Dave
Re: [Numpy-discussion] Is numpy ignoring CFLAGS?
On Mon, Jun 28, 2010 at 6:56 PM, Dr. David Kirkby <david.kir...@onetel.net> wrote: [clip]

Many other parts of Sage seem to inherit the flags OK from Python, but not numpy.

Are you saying that OPT is not taken into account? It seems to work for me; e.g.

    OPT=-m64 python setup.py build_ext

does put -m64 somewhere in CFLAGS. When using numpy.distutils, CFLAGS should never be overridden unless you are ready to set up the whole set of options manually. By default, CFLAGS is the concatenation of BASECFLAGS, OPT and CCSHARED (in that order), and only OPT should be tweaked in general.

David
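The concatenation David describes can be inspected from Python itself. A quick diagnostic sketch (on CPython these config variables come from the Makefile that Python was built with):

```python
import sysconfig

# distutils assembles CFLAGS from BASECFLAGS, OPT and CCSHARED
# (in that order), which is why OPT is the safe knob to tweak.
for var in ("BASECFLAGS", "OPT", "CCSHARED", "CFLAGS"):
    print(var, "=", sysconfig.get_config_var(var))
```

Running this before and after exporting OPT shows whether a custom flag actually reaches the recorded build configuration.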
Re: [Numpy-discussion] Is numpy ignoring CFLAGS?
On 06/28/10 11:28 AM, David Cournapeau wrote: [clip]

Are you saying that OPT is not taken into account? It seems to work for me; e.g. OPT=-m64 python setup.py build_ext does put -m64 somewhere in CFLAGS. When using numpy.distutils, CFLAGS should never be overridden unless you are ready to set up the whole set of options manually. By default, CFLAGS is the concatenation of BASECFLAGS, OPT and CCSHARED (in that order), and only OPT should be tweaked in general.

OPT is not being totally ignored, but some part of numpy is trying to build without taking -m64 into account. Note in the output below that as numpy is compiled, -m64 is shown on some lines, which is what I expect. But note also the messages about "wrong ELF class: ELFCLASS64" from the linker, indicating to me that the linker is not expecting to find 64-bit objects. All the libraries in the directory /export/home/drkirkby/sage-4.5.alpha0/local/lib are 64-bit.

Someone worked around this on OS X by creating a script called 'gcc_fake', with this content:

    #!/bin/bash
    /usr/bin/gcc -m64 $@

If a 64-bit build was going to be done, this was renamed to 'gcc' and put in the path first before building numpy. Once numpy was built, the script could be deleted. All the script does is basically invoke gcc with the -m64 option, which is making me think that -m64 is being missed somewhere. I just deleted this gcc_fake, as it had a hard-coded path. I then created it dynamically, so the path of the real gcc does not need to be /usr/bin/gcc. Doing that, I can build numpy 64-bit on OpenSolaris. But of course this is a hack, and I'd rather avoid the hack if possible. BTW, if it would help, I could create you an account on this machine so you could test it yourself.
I'm not trying to get out of doing the work, but the offer is there.

Dave

    numpy-1.3.0.p3/spkg-install
    numpy-1.3.0.p3/.hgignore
    Finished extraction
    Host system:
    uname -a: SunOS hawk 5.11 snv_134 i86pc i386 i86pc
    CC Version:
    gcc -v
    Using built-in specs.
    Target: i386-pc-solaris2.11
    Configured with: /export/home/drkirkby/gcc-4.4.4/configure --prefix=/usr/local/gcc-4.4.4-multilib --enable-languages=c,c++,fortran --with-gmp=/usr/local/gcc-4.4.4-multilib --with-mpfr=/usr/local/gcc-4.4.4-multilib --disable-nls --enable-checking=release --enable-werror=no --enable-multilib --with-system-zlib --enable-bootstrap --with-gnu-as --with-as=/usr/local/binutils-2.20/bin/as --without-gnu-ld --with-ld=/usr/ccs/bin/ld
    Thread model: posix
    gcc version 4.4.4 (GCC)
    Running from numpy source directory.
    F2PY Version 2
    blas_opt_info:
    blas_mkl_info:
      libraries mkl,vml,guide not found in /export/home/drkirkby/sage-4.5.alpha0/local/lib
      NOT AVAILABLE
    atlas_blas_threads_info:
    Setting PTATLAS=ATLAS
      libraries ptf77blas,ptcblas,atlas not found in /export/home/drkirkby/sage-4.5.alpha0/local/lib
      NOT AVAILABLE
    atlas_blas_info:
      FOUND:
        libraries = ['f77blas', 'cblas', 'atlas']
        library_dirs = ['/export/home/drkirkby/sage-4.5.alpha0/local/lib']
        language = c
        include_dirs = ['/export/home/drkirkby/sage-4.5.alpha0/local/include']
    /export/home/drkirkby/sage-4.5.alpha0/spkg/build/numpy-1.3.0.p3/src/numpy/distutils/command/config.py:361: DeprecationWarning:
        Usage of get_output is deprecated: please do not use it anymore, and avoid configuration checks involving running executable on the target machine.
      DeprecationWarning)
    customize Sage_FCompiler_1
    customize Sage_FCompiler_1
    customize Sage_FCompiler_1 using config
    compiling '_configtest.c':
    /* This file is generated from numpy/distutils/system_info.py */
    void ATL_buildinfo(void);
    int main(void) {
      ATL_buildinfo();
      return 0;
    }
    C compiler: gcc -fno-strict-aliasing -g -O2 -DNDEBUG -g -O3 -m64 -Wall -Wstrict-prototypes -fPIC
    compile options: '-c'
    gcc: _configtest.c
    gcc _configtest.o -L/export/home/drkirkby/sage-4.5.alpha0/local/lib -lf77blas -lcblas -latlas -o _configtest
    ld: fatal: file _configtest.o: wrong ELF class: ELFCLASS64
    ld: fatal: file processing errors. No output written to _configtest
    collect2: ld returned 1 exit status
    ld: fatal: file _configtest.o: wrong ELF class: ELFCLASS64
    ld: fatal: file processing errors. No output written to _configtest
    collect2: ld returned 1 exit status
    failure.
    removing: _configtest.c _configtest.o
    Status: 255
    Output:
      FOUND:
        libraries = ['f77blas', 'cblas', 'atlas']
Re: [Numpy-discussion] numscons and Python 2.7 problems
On Fri, Jun 25, 2010 at 12:05 PM, Christoph Gohlke <cgoh...@uci.edu> wrote:

On 6/7/2010 1:58 PM, Charles R Harris wrote:

On Mon, Jun 7, 2010 at 2:07 PM, Christoph Gohlke <cgoh...@uci.edu> wrote:

Dear NumPy developers, I was trying to build numpy 2.0.dev and scipy 0.9.dev using numscons 0.12.dev and Python 2.7b2 (32 bit) on Windows 7 64 bit, and ran into two problems. First, Python 2.7's UserDict is now a new-style class (http://docs.python.org/dev/whatsnew/2.7.html). Apparently scons 1.2, which is part of numscons, is not compatible with this change, and an AttributeError is thrown.

I wonder sometimes if it is worth supporting Python versions 2.x > 2.6. The developers seem intent on breaking the API as much as possible, and support could become a burden. At least 3.x has a policy of keeping the API intact. I'm only half joking here.

Good news: Python 2.7's UserDict has been reverted to an old-style class after release candidate 2. See the discussion at http://mail.python.org/pipermail/python-dev/2010-June/thread.html#100885

I just successfully built numpy from source on today's 2.7 snapshot with the following command:

    LDFLAGS="-arch x86_64" FFLAGS="-arch x86_64" py27 setupscons.py scons

Vincent

Christoph
Re: [Numpy-discussion] Is numpy ignoring CFLAGS?
On Mon, Jun 28, 2010 at 9:28 PM, Dr. David Kirkby <david.kir...@onetel.net> wrote: [clip]

OPT is not being totally ignored, but some part of numpy is trying to build without taking -m64 into account. Note in the output below that as numpy is compiled, -m64 is shown on some lines, which is what I expect. But note also the messages about "wrong ELF class: ELFCLASS64" from the linker, indicating to me that the linker is not expecting to find 64-bit objects.

Ah, that makes the issue much clearer: the linker is not passed the -m64 option. It works with distutils because CFLAGS is appended to LDFLAGS if CFLAGS is in os.environ, but we use CFLAGS differently. I am not sure how to fix that issue...

David
Re: [Numpy-discussion] numpy.load raising IOError but EOFError expected
Sorry, I had no access during these days. Thanks for the answer, Friedrich. I had already checked numpy.savez, but unfortunately I cannot make use of it. I don't have all the data to be saved at the same time... it is produced each time I run a test. Thanks anyway! Any other idea why this is happening? Is it expected behavior?

On Thu, Jun 24, 2010 at 7:30 PM, Friedrich Romstedt <friedrichromst...@gmail.com> wrote:

2010/6/23 Ruben Salvador <rsalvador...@gmail.com>:

Therefore, is this a bug? Shouldn't EOFError be raised instead of IOError? Or am I misunderstanding something? If this is not a bug, how can I detect the EOF to stop reading (I expect a way for this to work without tweaking the code by first saving in the file the number of dumps done)?

Maybe you can make use of numpy.savez: http://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html#numpy-savez

Friedrich

-- Rubén Salvador
PhD student @ Centro de Electrónica Industrial (CEI), http://www.cei.upm.es
Blog: http://aesatcei.wordpress.com
Re: [Numpy-discussion] numpy.load raising IOError but EOFError expected
On Wednesday 2010-06-23 at 12:46 +0200, Ruben Salvador wrote: [clip]

how can I detect the EOF to stop reading

    r = f.read(1)
    if not r:
        break  # EOF
    else:
        f.seek(-1, 1)

-- Pauli Virtanen
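Pauli's EOF check, embedded in a complete loop that reads several arrays dumped back-to-back with numpy.save (an in-memory buffer stands in for the file here):

```python
import io
import numpy as np

# write three arrays back-to-back, as repeated np.save calls would
buf = io.BytesIO()
for i in range(3):
    np.save(buf, np.arange(i + 1))
buf.seek(0)

# read until EOF using the peek-one-byte trick
arrays = []
while True:
    r = buf.read(1)
    if not r:
        break          # EOF reached cleanly
    buf.seek(-1, 1)    # rewind the probe byte
    arrays.append(np.load(buf))
```

This avoids having to store the number of dumps in the file, at the cost of one extra read and seek per array.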
Re: [Numpy-discussion] Is numpy ignoring CFLAGS?
On 06/28/10 02:54 PM, David Cournapeau wrote: [clip]

Ah, that makes the issue much clearer: the linker is not passed the -m64 option. It works with distutils because CFLAGS is appended to LDFLAGS if CFLAGS is in os.environ, but we use CFLAGS differently. I am not sure how to fix that issue...

I've tried adding -m64 to LDFLAGS before exporting that. That does not help. Also, the Sun linker has the option -64 (note, no letter 'm'):

    drkir...@hawk:~$ man ld
    Reformatting page.  Please Wait... done

    User Commands                                            ld(1)

    NAME
        ld - link-editor for object files

    SYNOPSIS
        ld [-32 | -64] [-a | -r] [-b] [-Bdirect | nodirect] ...
        <snip>

    DESCRIPTION
        The link-editor, ld, combines relocatable object files by
        resolving symbol references to symbol definitions, together
        <snip>

    OPTIONS
        The following options are supported.

        -32 | -64
            Creates a 32-bit, or 64-bit object. By default, the
            class of the object being generated is determined from
            the first ELF object processed from the command line.
            If no objects are specified, the class is determined by
            the first object encountered within the first archive
            processed from the command line. If there are no
            objects or archives, the link-editor creates a 32-bit
            object. The -64 option is required to create a 64-bit
            object solely from a mapfile. The -32 or -64 options
            can also be used in the rare case of linking entirely
            from an archive that contains a mixture of 32 and
            64-bit objects. If the first object in the archive is
            not the class of the object that is required to be
            created, then the -32 or -64 option can be used to
            direct the link-editor.

    SunOS 5.11          Last change: 3 Dec 2009

I tried adding that to LDFLAGS. Again it did not help. The *only* solution I can find to date is to create a script which invokes gcc with the -m64 option, but that is a horrible hack.

Dave
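A dynamically generated wrapper along the lines Dave describes might look like the following. This is a hypothetical sketch, not the actual Sage spkg-install code; the wrapper directory would be prepended to PATH before building numpy and removed afterwards.

```python
import os
import shutil
import stat
import tempfile

# locate the real compiler instead of hard-coding /usr/bin/gcc
real_gcc = shutil.which("gcc") or "/usr/bin/gcc"

# write a 'gcc' wrapper that forces -m64 and forwards all arguments
wrapper_dir = tempfile.mkdtemp()
wrapper = os.path.join(wrapper_dir, "gcc")
with open(wrapper, "w") as f:
    f.write('#!/bin/sh\nexec %s -m64 "$@"\n' % real_gcc)

# make it executable so a build picks it up from PATH
mode = os.stat(wrapper).st_mode
os.chmod(wrapper, mode | stat.S_IEXEC | stat.S_IXGRP | stat.S_IXOTH)
```

Prepending `wrapper_dir` to PATH then makes every `gcc` invocation a 64-bit one, which is exactly why it works around the conftest that ignores CFLAGS, and also why it remains a hack rather than a fix.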
[Numpy-discussion] Decorator to access intermediate results in a function/algorithm
Dear everybody,

This message belongs only marginally to a numpy-related mailing list, but I thought it might be of interest here, since it addresses what I believe is a common pattern in scientific development. My apologies if that is not the case... The code can be found at http://github.com/pberkes/persistent_locals and requires the byteplay library (http://code.google.com/p/byteplay/).

The problem
===========

In scientific development, functions often represent complex data-processing algorithms that transform input data into a desired output. Internally, the function typically requires several intermediate results to be computed and stored in local variables. As a simple toy example, consider the following function, which takes three arguments and returns True if the sum of the arguments is smaller than their product:

    def is_sum_lt_prod(a, b, c):
        sum = a + b + c
        prod = a * b * c
        return sum < prod

A frequently occurring problem is that the developer/final user may need to access the intermediate results at a later stage, in order to analyze the detailed behavior of the algorithm, for debugging, or to write more comprehensive tests. A possible solution would be to re-define the function and return the needed internal variables, but this would break the existing code. A better solution would be to add a keyword argument to return more information:

    def is_sum_lt_prod(a, b, c, internals=False):
        sum = a + b + c
        prod = a * b * c
        if internals:
            return sum < prod, sum, prod
        else:
            return sum < prod

This would keep the existing code intact, but only moves the problem to later stages of development. If the developer subsequently needs access to even more local variables, the code has to be modified again, and part of the code is broken. Moreover, this style leads to ugly code like

    res, _, _, _, var1, _, var3 = f(x)

where most of the returned values are irrelevant.

Proposed solution
=================

The proposed solution consists of a decorator that makes the local variables accessible from a function attribute, 'locals'.
For example:

    @persistent_locals
    def is_sum_lt_prod(a, b, c):
        sum = a + b + c
        prod = a * b * c
        return sum < prod

After calling the function (e.g. is_sum_lt_prod(2, 1, 2), which returns False) we can analyze the intermediate results as

    is_sum_lt_prod.locals
    -> {'a': 2, 'b': 1, 'c': 2, 'prod': 4, 'sum': 5}

This style is cleaner, is consistent with the principle of identifying the value returned by a function as the output of an algorithm, and is robust to changes in the needs of the researcher. Note that the local variables are saved even in case of an exception, which turns out to be quite useful for debugging.

How it works
============

The local variables in the inner scope of a function are not easily accessible. One solution (which I have not tried) may be to use tracing code like the one used in a debugger. This, however, would have a considerable cost in time. The proposed approach is to wrap the function in a callable object, and modify its bytecode by adding an external try...finally statement as follows:

    def f(self, *args, **kwargs):
        try:
            ... old code ...
        finally:
            self.locals = locals().copy()
            del self.locals['self']

The reason for wrapping the function in a class, instead of saving the locals in a function attribute directly, is that there are all sorts of complications in referring to itself from within a function. For example, referring to the attribute as f.locals results in the bytecode looking for the name 'f' in the namespace, and therefore moving the function, e.g. with

    g = f
    del f

would break 'g'. There are even more problems for functions defined in a closure. I tried modifying f.func_globals with a custom dictionary which keeps a reference to f.func_globals, adding a static element for 'f', but this does not work, as the Python interpreter does not access the func_globals dictionary with Python calls but directly with PyDict_GetItem (see http://osdir.com/ml/python.ideas/2007-11/msg00092.html). It is thus impossible to re-define __getitem__ to return 'f' as needed.
Ideally, one would like to define a new closure for the function with a cell variable containing the reference, but this is impossible at present, as far as I can tell. An alternative solution (see persistent_locals_with_kwarg in deco.py) is to change the signature of the function with an additional keyword argument, f(arg1, arg2, _self_ref=f). However, this approach breaks functions that define an *args argument.

Cost
====

The increase in execution time of the decorated function is minimal. Given its domain of application, most of the functions will take a significant amount of time to complete, making the cost of the decoration negligible:

    import time
    def f(x):
        time.sleep(0.5)
        return 2*x

    df = deco.persistent_locals(f)

    %timeit f(1)
    10 loops, best of 3: 501 ms per loop
    %timeit df(1)
    10 loops, best of 3: 502 ms per loop

Conclusion
==========

The problem of
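For comparison, here is a rough approximation of the tracing alternative mentioned above, using sys.setprofile rather than byteplay bytecode rewriting. It is not the package's implementation, and a profile hook has real overhead per call event, but it reproduces the described behavior with standard-library tools only:

```python
import sys

def persistent_locals(func):
    """Capture func's local variables on return.

    Tracing-based sketch of the idea; the real persistent_locals
    package rewrites the function's bytecode instead."""
    def wrapper(*args, **kwargs):
        captured = {}
        def profiler(frame, event, arg):
            # snapshot the locals when the wrapped function's frame returns
            # (this event fires even if the function raises)
            if event == "return" and frame.f_code is func.__code__:
                captured.update(frame.f_locals)
        old = sys.getprofile()
        sys.setprofile(profiler)
        try:
            return func(*args, **kwargs)
        finally:
            sys.setprofile(old)
            wrapper.locals = captured
    return wrapper

@persistent_locals
def is_sum_lt_prod(a, b, c):
    sum = a + b + c
    prod = a * b * c
    return sum < prod
```

Calling is_sum_lt_prod(2, 1, 2) returns False and leaves is_sum_lt_prod.locals holding a, b, c, sum and prod, matching the interface described in the post.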
Re: [Numpy-discussion] Decorator to access intermediate results in a function/algorithm
What are the implications of this with respect to memory usage? When working with large arrays, if the intermediate values of a number of functions are kept around (whether we want to access them or not) could this not lead to excessive memory usage? Maybe this behavior should only apply when (as you suggest in the counterexample) a locals=True kwarg is passed in. It seems like a lot of the maintainability issues raised in the counterexample could be solved by returning a dictionary or a bunch [1] instead of a tuple -- though that still (without care on the part of the user) has the keeping around references too much stuff problem. [1] http://code.activestate.com/recipes/52308-the-simple-but-handy-collector-of-a-bunch-of-named/ Mike On 06/28/2010 12:35 PM, Pietro Berkes wrote: Dear everybody, This message belongs only marginally to a numpy-related mailing list, but I thought it might be of interest here since it addresses what I believe is a common pattern in scientific development. My apologies if that is not the case... The code can be found at http://github.com/pberkes/persistent_locals and requires the byteplay library (http://code.google.com/p/byteplay/). The problem = In scientific development, functions often represent complex data processing algorithm that transform input data into a desired output. Internally, the function typically requires several intermediate results to be computed and stored in local variables. As a simple toy example, consider the following function, that takes three arguments and returns True if the sum of the arguments is smaller than their product: def is_sum_lt_prod(a,b,c): sum = a+b+c prod = a*b*c return sumprod A frequently occurring problem is that the developer/final user may need to access the intermediate results at a later stage, in order to analyze the detailed behavior of the algorithm, for debugging, or to write more comprehensive tests. 
A possible solution would be to re-define the function and return the needed internal variables, but this would break the existing code. A better solution would be to add a keyword argument to return more information:

def is_sum_lt_prod(a, b, c, internals=False):
    sum = a + b + c
    prod = a * b * c
    if internals:
        return sum < prod, sum, prod
    else:
        return sum < prod

This would keep the existing code intact, but it only moves the problem to later stages of development. If the developer subsequently needs access to even more local variables, the code has to be modified again, and part of the code is broken. Moreover, this style leads to ugly code like

res, _, _, _, var1, _, var3 = f(x)

where most of the returned values are irrelevant.

Proposed solution
=================

The proposed solution consists of a decorator that makes the local variables accessible from a function attribute, 'locals'. For example:

@persistent_locals
def is_sum_lt_prod(a, b, c):
    sum = a + b + c
    prod = a * b * c
    return sum < prod

After calling the function (e.g. is_sum_lt_prod(2,1,2), which returns False) we can analyze the intermediate results as

is_sum_lt_prod.locals
-> {'a': 2, 'b': 1, 'c': 2, 'prod': 4, 'sum': 5}

This style is cleaner, is consistent with the principle of identifying the value returned by a function as the output of an algorithm, and is robust to changes in the needs of the researcher. Note that the local variables are saved even in case of an exception, which turns out to be quite useful for debugging.

How it works
============

The local variables in the inner scope of a function are not easily accessible. One solution (which I have not tried) may be to use tracing code like the one used in a debugger. This, however, would have a considerable cost in time. The proposed approach is to wrap the function in a callable object, and modify its bytecode by adding an external try...finally statement as follows:

def f(self, *args, **kwargs):
    try:
        ... old code ...
    finally:
        self.locals = locals().copy()
        del self.locals['self']

The reason for wrapping the function in a class, instead of saving the locals in a function attribute directly, is that there are all sorts of complications in referring to itself from within a function. For example, referring to the attribute as f.locals results in the bytecode looking for the name 'f' in the namespace, and therefore moving the function, e.g. with

g = f
del f

would break 'g'. There are even more problems for functions defined in a closure. I tried modifying f.func_globals with a custom dictionary which keeps a reference to f.func_globals, adding a static element for 'f', but this does not work as the Python interpreter does not access the func_globals dictionary with Python calls but directly with PyDict_GetItem (see http://osdir.com/ml/python.ideas/2007-11/msg00092.html). It is thus impossible to re-define
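The bytecode-rewriting mechanism above requires the byteplay library. Purely as a runnable sketch of the same idea, the locals can also be captured with a profile hook via sys.setprofile; note this carries exactly the tracing cost the message warns about, and the helper names (wrapper, profiler) are made up for illustration:

```python
import sys

def persistent_locals(func):
    # Tracing-based sketch (hypothetical alternative): capture the decorated
    # function's locals when its frame exits, even if it raises.  The actual
    # proposal rewrites the bytecode instead, to avoid tracing overhead.
    def wrapper(*args, **kwargs):
        captured = {}

        def profiler(frame, event, arg):
            # The 'return' event fires when func's frame exits, including
            # on exceptions; filter by code object to ignore nested calls.
            if event == 'return' and frame.f_code is func.__code__:
                captured.update(frame.f_locals)

        old_profiler = sys.getprofile()
        sys.setprofile(profiler)
        try:
            return func(*args, **kwargs)
        finally:
            sys.setprofile(old_profiler)
            wrapper.locals = captured
    return wrapper

@persistent_locals
def is_sum_lt_prod(a, b, c):
    sum = a + b + c
    prod = a * b * c
    return sum < prod

is_sum_lt_prod(2, 1, 2)        # returns False
print(is_sum_lt_prod.locals)   # includes 'sum': 5 and 'prod': 4
```

The previous profiler is restored in the finally clause, so the hook is only active while the decorated function runs.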
Re: [Numpy-discussion] Decorator to access intermediate results in a function/algorithm
Dear Mike,

Thanks for the feedback.

On Mon, Jun 28, 2010 at 12:51 PM, Michael Droettboom md...@stsci.edu wrote:

What are the implications of this with respect to memory usage? When working with large arrays, if the intermediate values of a number of functions are kept around (whether we want to access them or not) could this not lead to excessive memory usage? Maybe this behavior should only apply when (as you suggest in the counterexample) a locals=True kwarg is passed in.

I've been thinking about it, but I haven't decided on a final implementation yet. I find it a bit messy to add a new kwarg to the signature of an existing function, as it might conflict with an existing *args argument. For example, redefining f(x, *args) as f(x, locals=True, *args) would break code calling f as f(1, 2, 3). There are several alternatives:

1) add to the wrapping class a property to switch the behavior of the decorator on and off
2) introduce a naming convention (e.g., variables whose name begins with '_' are not saved)
3) have an option to dump the local variables to a file

The solution I prefer so far is the second, but since I never had the problem in my own code so far I'm not sure which option is most useful in practice.

It seems like a lot of the maintainability issues raised in the counterexample could be solved by returning a dictionary or a bunch [1] instead of a tuple -- though that still (without care on the part of the user) has the problem of keeping around references to too much stuff.

[1] http://code.activestate.com/recipes/52308-the-simple-but-handy-collector-of-a-bunch-of-named/

It's true that the counter-example is slightly unrealistic, although I have seen similar bits of code in real-life examples. Using a decorator is an advantage when dealing with code defined in a third-party library.
Pietro

Mike

On 06/28/2010 12:35 PM, Pietro Berkes wrote:
[clip: original message quoted in full above]
Re: [Numpy-discussion] numpy.load raising IOError but EOFError expected
2010/6/28 Ruben Salvador rsalvador...@gmail.com: Thanks for the answer Friedrich, I had already checked numpy.savez, but unfortunately I cannot make use of it. I don't have all the data needed to be saved at the same time...it is produced each time I run a test.

Yes, I thought of something like:

all_data = numpy.load('file.npz')
all_data[new_key] = new_data
numpy.savez('file.npz', **all_data)

Will this work?

Friedrich

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
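One detail the sketch above glosses over: numpy.load on an .npz archive returns an NpzFile object, not a plain dict, so the new key can't be assigned directly; the arrays need copying into a real dict first. A hedged version of the round trip (the file path and array names here are made up for illustration):

```python
import os
import tempfile
import numpy as np

# Illustrative path and array names; each "run" stands in for one test run.
path = os.path.join(tempfile.mkdtemp(), 'file.npz')
np.savez(path, run1=np.arange(3))                  # result of the first run

npz = np.load(path)
all_data = {key: npz[key] for key in npz.files}    # NpzFile -> real dict
npz.close()

all_data['run2'] = np.arange(4)                    # result of a later run
np.savez(path, **all_data)                         # rewrites the whole archive

print(sorted(np.load(path).files))                 # ['run1', 'run2']
```

Note that this rewrites the entire archive on every save, which gets expensive as the number of stored runs grows.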
Re: [Numpy-discussion] numpy.load raising IOError but EOFError expected
On 28 June 2010 10:52, Ruben Salvador rsalvador...@gmail.com wrote:

Sorry I had no access during these days. Thanks for the answer Friedrich, I had already checked numpy.savez, but unfortunately I cannot make use of it. I don't have all the data needed to be saved at the same time...it is produced each time I run a test.

I think people are uncomfortable because .npy files are not designed to contain more than one array. It's a bit like concatenating a whole lot of .PNG files together: while a good decoder could pick them apart again, it's a highly misleading file, since the headers do not contain any information about all the other files. .npy files are similarly self-describing, and so concatenating them is a peculiar sort of thing to do.

Why not simply save a separate file each time, so that you have a directory full of files? Or, if you must have just one file, use np.savez (loading the old one each time, then saving the expanded object). Come to think of it, it's possible to append files to an existing zip file without rewriting the whole thing. Does numpy.savez allow this mode?

That said, good exception hygiene argues that np.load should throw EOFErrors rather than the more generic IOErrors, but I don't know how difficult this would be to achieve.

Anne

Thanks anyway! Any other idea why this is happening? Is it expected behavior?

On Thu, Jun 24, 2010 at 7:30 PM, Friedrich Romstedt friedrichromst...@gmail.com wrote:

2010/6/23 Ruben Salvador rsalvador...@gmail.com: Therefore, is this a bug? Shouldn't EOFError be raised instead of IOError? Or am I misunderstanding something? If this is not a bug, how can I detect the EOF to stop reading (I expect a way for this to work without tweaking the code with saving first in the file the number of dumps done)?

Maybe you can make use of numpy.savez, http://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html#numpy-savez .
Friedrich

--
Rubén Salvador
PhD student @ Centro de Electrónica Industrial (CEI)
http://www.cei.upm.es
Blog: http://aesatcei.wordpress.com
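To Anne's question above: np.savez itself has no append mode, but since an .npz file is an ordinary zip archive, the append-without-rewriting she describes can be sketched directly with the zipfile module (the path and member names here are made up, and whether this is advisable for real data is a separate question):

```python
import io
import os
import tempfile
import zipfile
import numpy as np

# Illustrative path; 'run0' stands in for an earlier test run's result.
path = os.path.join(tempfile.mkdtemp(), 'results.npz')
np.savez(path, run0=np.arange(3))

# Serialize the new array to .npy bytes, then append it as a new zip member
# in 'a' (append) mode, without rewriting the members already present.
buf = io.BytesIO()
np.save(buf, np.arange(5))
with zipfile.ZipFile(path, 'a') as zf:
    zf.writestr('run1.npy', buf.getvalue())

# np.load lists both members: the original and the appended one.
print(sorted(np.load(path).files))    # ['run0', 'run1']
```

This works because np.load simply enumerates the .npy members of the zip; it does not care how they got there.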
Re: [Numpy-discussion] numpy.load raising IOError but EOFError expected
ma, 2010-06-28 kello 15:48 -0400, Anne Archibald kirjoitti: [clip] That said, good exception hygiene argues that np.load should throw EOFErrors rather than the more generic IOErrors, but I don't know how difficult this would be to achieve.

np.load is in any case unhygienic, since it falls back to unpickling if it doesn't see the .npy magic header.

--
Pauli Virtanen
Re: [Numpy-discussion] numpy.load raising IOError but EOFError expected
On Wed, Jun 23, 2010 at 3:46 AM, Ruben Salvador rsalvador...@gmail.com wrote:

Hi there, I have a .npy file built by successively adding results from different test runs of an algorithm. Each time it's run, I save a numpy.array using numpy.save as follows:

fn = 'file.npy'
f = open(fn, 'a+b')
np.save(f, arr)
f.close()

How about using h5py? It's not part of numpy but it gives you a dictionary-like interface to your archive:

>>> import h5py
>>> io = h5py.File('/tmp/data.hdf5')
>>> arr1 = np.array([1, 2, 3])
>>> arr2 = np.array([4, 5, 6])
>>> arr3 = np.array([7, 8, 9])
>>> io['arr1'] = arr1
>>> io['arr2'] = arr2
>>> io['arr3'] = arr3
>>> io.keys()
['arr1', 'arr2', 'arr3']
>>> io['arr1'][:]
array([1, 2, 3])

You can also load part of an array (useful when the array is large):

>>> io['arr1'][-1]
3
Re: [Numpy-discussion] numpy.load raising IOError but EOFError expected
2010/6/28 Keith Goodman kwgood...@gmail.com: How about using h5py? It's not part of numpy but it gives you a dictionary-like interface to your archive:

Yeah, or PyTables (is that equivalent)? It's also an HDF (or whatever, I don't recall precisely) interface. There were [ANN]s on the list about PyTables.

Friedrich