Re: [Numpy-discussion] Memory mapping and NPZ files
Mathieu Dubois wrote:
> The point is precisely that, you can't do memory mapping with Npz files
> (while it works with Npy files).

The operating system can memory map any file. But as npz-files are compressed, you will need to uncompress the contents in your memory mapping to make sense of it.

I would suggest you use PyTables instead of npz-files. It allows on the fly compression and uncompression (via blosc) and will probably do what you want.

Sturla
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
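The difference discussed above can be checked with a small experiment. This is only a sketch: the file names and the temporary directory are made up for illustration, and the observation that `mmap_mode` is ignored for `.npz` archives reflects the behaviour described in this thread rather than a documented guarantee for every NumPy version.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch location and file names, used only for this demo.
tmpdir = tempfile.mkdtemp()
npy_path = os.path.join(tmpdir, "data.npy")
npz_path = os.path.join(tmpdir, "data.npz")

X = np.arange(1000, dtype=np.float64)
np.save(npy_path, X)                # plain .npy file
np.savez_compressed(npz_path, X=X)  # zip archive with a compressed X.npy member

# A .npy file can be memory-mapped directly.
X_npy = np.load(npy_path, mmap_mode="r")

# For an .npz archive the member is read (and decompressed) into memory.
with np.load(npz_path, mmap_mode="r") as f:
    X_npz = f["X"]

print(type(X_npy).__name__, type(X_npz).__name__)
```

Note that `np.savez` (as opposed to `np.savez_compressed`) stores its members uncompressed, but they are still read through the zip layer rather than mapped.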
Re: [Numpy-discussion] Numpy intermittent seg fault
Hi,

On Fri, 11 Dec 2015 10:05:59 +1000 Jacopo Sabbatini wrote:
>
> I'm experiencing random segmentation faults from numpy. I have generated a
> core dump and extracted a stack trace, the following:
>
> #0 0x7f3a8d921d5d in getenv () from /lib64/libc.so.6
> #1 0x7f3a843bde21 in blas_set_parameter () from
> /opt/apps/sidescananalysis-9.7.1-42-gdd3e068+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0
> #2 0x7f3a843bcd91 in blas_memory_alloc () from
> /opt/apps/sidescananalysis-9.7.1-42-gdd3e068+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0
> #3 0x7f3a843bd4e5 in blas_thread_server () from
> /opt/apps/sidescananalysis-9.7.1-42-gdd3e068+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0
> #4 0x7f3a8e09ff18 in start_thread () from /lib64/libpthread.so.0
> #5 0x7f3a8d9ceb2d in clone () from /lib64/libc.so.6
>
> I have experienced the segfault from several code paths but they all have
> the same stack trace.
>
> I use conda to run python and numpy. The dump of the packages version is:

In addition to openblas, you should also submit a bug to Anaconda so that they know of problems with that particular openblas version:
https://github.com/ContinuumIO/anaconda-issues

Regards

Antoine.
[Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy
From time to time it is asked on forums how to extend the precision of computation on Numpy arrays. The most common answer given to this question is: use dtype=object with some arbitrary precision module like mpmath or gmpy. See
http://stackoverflow.com/questions/6876377/numpy-arbitrary-precision-linear-algebra
or
http://stackoverflow.com/questions/21165745/precision-loss-numpy-mpmath
or
http://stackoverflow.com/questions/15307589/numpy-array-with-mpz-mpfr-values

While this is obviously the most relevant answer for many users because it will allow them to use Numpy arrays exactly as they would have used them with native types, the drawback is that from some point of view "true" vectorization will be lost.

Over the years I got very familiar with the extended double-double type, which has (for usual architectures) about 32 accurate digits with faster arithmetic than "arbitrary precision types". I even used it for research purposes in number theory and I got convinced that it is a wonderful type as long as such precision is suitable.

I often implemented it partially under Numpy, most of the time by trying to vectorize the libqd library at a low level.

But I recently thought that a very nice and portable way of implementing it under Numpy would be to use the existing layer of vectorization on floats for computing the arithmetic operations by "columns containing half of the numbers" rather than by "full numbers". As a proof of concept I wrote the following file:
https://gist.github.com/baruchel/c86ed748939534d8910d

I converted and vectorized the Algol 60 codes from
http://szmoore.net/ipdf/documents/references/dekker1971afloating.pdf
(Dekker, 1971).

A test is provided at the end; for inverting 100,000 numbers, my type is about 3 or 4 times faster than GMPY and almost 50 times faster than MPmath. It should be even faster for some other operations since I had to create another np.ones array for testing this type because inversion isn't implemented here (which could of course be done). You can run this file yourself (maybe you will have to discard mpmath or gmpy if you don't have them).

I would like to discuss how to make something related to that available.

a) Would it be relevant to include that in Numpy? (I would think of some "contribution"-tool rather than including it in the core of Numpy because it would be painful to code all ufuncs; on the other hand I am pretty sure that many would be happy to perform several arithmetic operations knowing that they can't use cos/sin/etc. on this type; in other words, I am not sure it would be a good idea to embed it as an every-day type but I think it would be nice to have it quickly available in some way). If you agree with that, in which way should I code it (the current link is only a "proof of concept"; I would be very happy to code it in some cleaner way)?

b) Do you think such an attempt should remain something external to Numpy itself and be released on my Github account without being integrated into Numpy?

Best regards,

--
Thomas Baruchel
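To make the "columns" idea concrete, here is a minimal sketch of double-double addition and multiplication on pairs of float64 arrays. It is not the code from the gist: the function names are made up for this illustration, the error-free transformations are the textbook ones (Knuth's two-sum and Dekker's split/product), and overflow, infinities and NaNs are not handled.

```python
import numpy as np

def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and s + err == a + b exactly."""
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err

def split(a):
    """Dekker's split of a float64 into two halves with few significant bits."""
    c = 134217729.0 * a  # 2**27 + 1
    hi = c - (c - a)
    lo = a - hi
    return hi, lo

def two_prod(a, b):
    """Return (p, err) with p = fl(a * b) and p + err == a * b exactly."""
    p = a * b
    ah, al = split(a)
    bh, bl = split(b)
    err = ((ah * bh - p) + ah * bl + al * bh) + al * bl
    return p, err

def dd_add(x, y):
    """Add two double-doubles, each given as a (hi, lo) pair of arrays."""
    (xh, xl), (yh, yl) = x, y
    s, e = two_sum(xh, yh)
    e = e + (xl + yl)
    hi = s + e
    lo = e - (hi - s)  # renormalise so that |lo| is small relative to hi
    return hi, lo

def dd_mul(x, y):
    """Multiply two double-doubles, each given as a (hi, lo) pair of arrays."""
    (xh, xl), (yh, yl) = x, y
    p, e = two_prod(xh, yh)
    e = e + (xh * yl + xl * yh)
    hi = p + e
    lo = e - (hi - p)
    return hi, lo

zeros = np.zeros(3)
one = (np.ones(3), zeros)
tiny = (np.full(3, 1e-20), zeros)
sum_hi, sum_lo = dd_add(one, tiny)  # 1e-20 survives in the low column

a = (np.full(3, 1.0 + 2.0**-30), zeros)
sq_hi, sq_lo = dd_mul(a, a)         # 2**-60 survives in the low column
```

Every operation above is an ordinary elementwise float64 operation on whole arrays, which is why the existing vectorization layer can be reused without a new dtype; the price is that each double-double array is carried around as two separate columns.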
Re: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy
On Fri, Dec 11, 2015 at 11:22 AM, Anne Archibald wrote:
> Actually, GCC implements 128-bit floats in software and provides them as
> __float128; there are also quad-precision versions of the usual functions.
> The Intel compiler provides this as well, I think, but I don't think
> Microsoft compilers do. A portable quad-precision library might be less
> painful.
>
> The cleanest way to add extended precision to numpy is by adding a
> C-implemented dtype. This can be done in an extension module; see the
> quaternion and half-precision modules online.
>
> Anne
>
> On Fri, Dec 11, 2015, 16:46 Charles R Harris wrote:
>>
>> I think astropy does something similar for time and dates. There has also
>> been some talk of adding a user type for ieee 128 bit doubles. I've looked
>> once for relevant code for the latter and, IIRC, the available packages were
>> GPL :(.
>>
>> Chuck

This might be the same as or similar to a recent announcement for Julia
https://groups.google.com/d/msg/julia-users/iHTaxRVj1yM/M-WtZCedCQAJ

It would be useful to get this in a consistent way across platforms and compilers. I can think of several applications where higher precision reduce operations would be useful in statistics.

As a Windows user, I never even saw a higher precision float.

Josef
Re: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy
Actually, GCC implements 128-bit floats in software and provides them as __float128; there are also quad-precision versions of the usual functions. The Intel compiler provides this as well, I think, but I don't think Microsoft compilers do. A portable quad-precision library might be less painful.

The cleanest way to add extended precision to numpy is by adding a C-implemented dtype. This can be done in an extension module; see the quaternion and half-precision modules online.

Anne

On Fri, Dec 11, 2015, 16:46 Charles R Harris wrote:
> I think astropy does something similar for time and dates. There has also
> been some talk of adding a user type for ieee 128 bit doubles. I've looked
> once for relevant code for the latter and, IIRC, the available packages
> were GPL :(.
>
> Chuck
Re: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy
On Fri, Dec 11, 2015 at 4:22 PM, Anne Archibald wrote:
> Actually, GCC implements 128-bit floats in software and provides them as
> __float128; there are also quad-precision versions of the usual functions.
> The Intel compiler provides this as well, I think, but I don't think
> Microsoft compilers do. A portable quad-precision library might be less
> painful.
>
> The cleanest way to add extended precision to numpy is by adding a
> C-implemented dtype. This can be done in an extension module; see the
> quaternion and half-precision modules online.

We actually used a __float128 dtype as an example of how to create a custom dtype for a numpy C tutorial we did w/ Stefan Van der Walt a few years ago at SciPy.

IIRC, one of the issues in making it more than a PoC was that numpy hardcoded things like long double being the highest precision, etc... But that may have been fixed since then.

David
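Since the replies above turn on what "long double" means on a given platform, a quick way to see what a particular NumPy build offers is to query `np.finfo`. This is a small sketch; the helper name `float_precisions` is made up here, and the `long double` row depends on the platform and compiler (on some builds it is identical to float64).

```python
import numpy as np

def float_precisions():
    """Return (label, bits, approximate decimal digits) for some float types."""
    rows = []
    for label, t in (("float32", np.float32),
                     ("float64", np.float64),
                     ("long double", np.longdouble)):
        info = np.finfo(t)
        rows.append((label, info.bits, info.precision))
    return rows

for label, bits, digits in float_precisions():
    print(f"{label:12s} storage bits={bits:4d} decimal digits~{digits}")
```

The `bits` value is the storage size, not the precision: an 80-bit extended type padded to 128 bits reports 128 bits but only about 18 decimal digits, well short of the roughly 32 digits discussed in this thread.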
Re: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy
> There has also been some talk of adding a user type for ieee 128 bit doubles.
> I've looked once for relevant code for the latter and, IIRC, the available
> packages were GPL :(.

This looks like it's BSD-ish:
http://www.jhauser.us/arithmetic/SoftFloat.html

Don't know if it's any good.

CHB
[Numpy-discussion] ANN: pyMIC v0.7 Released
Announcement: pyMIC v0.7
========================

I'm happy to announce the release of pyMIC v0.7.

pyMIC is a Python module to offload computation in a Python program to the Intel Xeon Phi coprocessor. It contains offloadable arrays and device management functions. It supports invocation of native kernels (C/C++, Fortran) and blends in with Numpy's array types for float, complex, and int data types.

For more information and downloads please visit pyMIC's Github page: https://github.com/01org/pyMIC. You can find pyMIC's mailinglist at https://lists.01.org/mailman/listinfo/pymic.

Full change log:
================

Version 0.7
* Experimental support for Python 3.
* 'None' arguments of kernels are converted to nullptr or NULL.
* Switched to Python's distutils to build and install pyMIC.
* Deprecated the build system based on Makefiles.

Version 0.6
* Experimental support for the Windows operating system.
* Switched to Cython to generate the glue code for pyMIC.
* Now using Markdown for README and CHANGELOG.
* Introduced PYMIC_DEBUG=3 to trace argument passing for kernels.
* Bugfix: added back the translate_device_pointer() function.
* Bugfix: example SVD now respects order of the passed matrices when applying the `dgemm` routine.
* Bugfix: fixed memory leak when invoking kernels.
* Bugfix: fixed broken translation of fake pointers.
* Refactoring: simplified bridge between pyMIC and LIBXSTREAM.

Version 0.5
* Introduced new kernel API that avoids insane pointer unpacking.
* pyMIC now uses libxstreams as the offload back-end (https://github.com/hfp/libxstream).
* Added smart pointers to make handling of fake pointers easier.

Version 0.4
* New low-level API to allocate, deallocate, and transfer data (see OffloadStream).
* Support for in-place binary operators.
* New internal design to handle offloads.

Version 0.3
* Improved handling of libraries and kernel invocation.
* Trace collection (PYMIC_TRACE=1, PYMIC_TRACE_STACKS={none,compact,full}).
* Replaced the device-centric API with a stream API.
* Refactoring to better match PEP8 recommendations.
* Added support for int(int64) and complex(complex128) data types.
* Reworked the benchmarks and examples to fit the new API.
* Bugfix: fixed syntax errors in OffloadArray.

Version 0.2
* Small improvements to the README files.
* New example: Singular Value Decomposition.
* Some documentation for the API functions.
* Added a basic testsuite for unit testing (WIP).
* Bugfix: benchmarks now use the latest interface.
* Bugfix: numpy.ndarray does not offer an attribute 'order'.
* Bugfix: number_of_devices was not visible after import.
* Bugfix: member offload_array.device is now initialized.
* Bugfix: use exception for errors w/ invoke_kernel & load_library.

Version 0.1
Initial release.

Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Christian Lamprechter
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
Re: [Numpy-discussion] Memory mapping and NPZ files
On Wed, Dec 9, 2015 at 9:51 AM, Mathieu Dubois wrote:
> Dear all,
>
> If I am correct, using mmap_mode with Npz files has no effect i.e.:
> f = np.load("data.npz", mmap_mode="r")
> X = f['X']
> will load all the data in memory.
>
> Can somebody confirm that?
>
> If I'm correct, the mmap_mode argument could be passed to the NpzFile class
> which could in turn perform the correct operation. One way to handle that
> would be to use the ZipFile.extract method to write the Npy file on disk and
> then load it with numpy.load with the mmap_mode argument. Note that the user
> will have to remove the file to reclaim disk space (I guess that's OK).
>
> One problem that could arise is that the extracted Npy file can be large
> (it's the purpose of using memory mapping) and therefore it may be useful to
> offer some control on where this file is extracted (for instance /tmp can be
> too small to extract the file here). numpy.load could offer a new option for
> that (passed to ZipFile.extract).

I have struggled for a long time with a similar (albeit more obscure) problem with PyFITS / astropy.io.fits when it comes to supporting memory-mapping of compressed FITS files. For those unaware, FITS is a file format used primarily in Astronomy.

I have all kinds of wacky ideas for optimizing this, but at the moment when you load data from a compressed FITS file with memory-mapping enabled, obviously there's not much benefit because the contents of the file are uncompressed in memory (there is a *little* benefit in that the compressed data is mmap'd, but the compressed data is typically much smaller than the uncompressed data).

Currently, in this case, I just issue a warning when the user explicitly requests mmap=True but won't get much benefit from it. Maybe np.load could do the same, but I don't have a strong opinion about it. (I only added the warning in PyFITS because a user requested it and was kind enough to provide a patch--seemed reasonable).

Erik
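The workaround Mathieu describes (extract the member with ZipFile.extract, then memory-map the resulting .npy file) can be sketched by hand today. The helper name `load_npz_member_mmap` and its parameters are hypothetical, not an existing NumPy API; the sketch assumes the archive was written by `np.savez` or `np.savez_compressed`, which store each array as a member named `<key>.npy`.

```python
import os
import tempfile
import zipfile

import numpy as np

def load_npz_member_mmap(npz_path, key, extract_dir, mmap_mode="r"):
    """Extract one array from an .npz archive and memory-map the result.

    Returns (array, path_of_extracted_file). The caller is responsible for
    deleting the extracted file to reclaim disk space.
    """
    with zipfile.ZipFile(npz_path) as zf:
        # extract() decompresses the member if needed and returns its path.
        npy_path = zf.extract(key + ".npy", path=extract_dir)
    return np.load(npy_path, mmap_mode=mmap_mode), npy_path

# Demo with made-up file names in a scratch directory.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "data.npz")
original = np.arange(12, dtype=np.float64).reshape(3, 4)
np.savez_compressed(archive, X=original)

X_mm, extracted_path = load_npz_member_mmap(archive, "X", extract_dir=workdir)
print(type(X_mm).__name__, X_mm.shape, extracted_path)
```

Passing `extract_dir` explicitly addresses the concern about /tmp being too small: the caller chooses a filesystem with enough room for the uncompressed array.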
Re: [Numpy-discussion] FeatureRequest: support for array construction from iterators
Constructing an array from an iterator is fundamentally different from constructing an array from an in-memory data structure like a list, because in the iterator case it's necessary to either use a single-pass algorithm or else create extra temporary buffers that cause much higher memory overhead. (Which is undesirable given that iterators are mostly used exactly in the case where one wants to reduce memory overhead.)

np.fromiter requires the dtype= argument because this is necessary if you want to construct the array in a single pass.

np.array(list(iter)) can avoid the dtype argument, because it creates that large memory buffer. IMO this is better than making np.array(iter) internally call list(iter) or equivalent, because the workaround (adding an explicit call to list()) is trivial, while also making it obvious to the user what the actual cost of their request is. (Explicit is better than implicit.)

In addition, the proposed API has a number of infelicities:
- We're generally trying to *reduce* the magic in functions like np.array (e.g. the discussions of having less magic for lists with mismatched numbers of elements, or non-list sequences)
- There's a strong convention in Python that when making a function like np.array generic, it should accept any iter*able* rather than any iter*ator*. But it would be super confusing if np.array({1: 2}) returned array([1]), or if array("foo") returned array(["f", "o", "o"]), so we don't actually want to handle all iterables the same. It's somewhat dubious even for iterators (e.g. someone might want to create an object array containing an iterator...)...

hope that helps,
-n

On Fri, Dec 11, 2015 at 2:27 PM, Stephan Sahm wrote:
> numpy.fromiter is neither numpy.array nor does it work similar to
> numpy.array(list(...)) as the dtype argument is necessary
>
> is there a reason, why np.array(...) should not work on iterators? I have
> the feeling that such requests get (repeatedly) dismissed, but until yet I
> haven't found a compelling argument for leaving this Feature missing (to
> remember, it is already implemented in a branch)
>
> Please let me know if you know about an argument,
> best,
> Stephan
>
> On 27 November 2015 at 14:18, Alan G Isaac wrote:
>>
>> On 11/27/2015 5:37 AM, Stephan Sahm wrote:
>>>
>>> I like to request a generator/iterator support for np.array(...) as far
>>> as list(...) supports it.
>>
>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromiter.html
>>
>> hth,
>> Alan Isaac

--
Nathaniel J. Smith -- http://vorpus.org
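The three behaviours contrasted above can be seen side by side in a short sketch. The generator helper `squares` is made up for the demo; the NumPy calls are the ones named in the message.

```python
import numpy as np

def squares(n):
    """A throwaway generator used as the example iterator."""
    return (i * i for i in range(n))

# Single pass, no intermediate list, but the dtype must be given up front.
a = np.fromiter(squares(5), dtype=np.int64)

# dtype is inferred, at the cost of materialising the whole list first.
b = np.array(list(squares(5)))

# Passing the generator itself does not consume it: the result is a
# zero-dimensional object array that merely wraps the generator.
c = np.array(squares(5))

# Omitting dtype from np.fromiter is an error rather than a guess.
try:
    np.fromiter(squares(5))
    fromiter_needs_dtype = False
except TypeError:
    fromiter_needs_dtype = True

print(a, b, c.shape, c.dtype, fromiter_needs_dtype)
```

The third case is the "object array containing an iterator" situation mentioned at the end of the list above, which is one reason changing `np.array` to consume iterators would not be a purely additive change.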
Re: [Numpy-discussion] FeatureRequest: support for array construction from iterators
numpy.fromiter is neither numpy.array nor does it work similarly to numpy.array(list(...)), as the dtype argument is necessary.

Is there a reason why np.array(...) should not work on iterators? I have the feeling that such requests get (repeatedly) dismissed, but so far I haven't found a compelling argument for leaving this feature missing (as a reminder, it is already implemented in a branch).

Please let me know if you know of an argument,
best,
Stephan

On 27 November 2015 at 14:18, Alan G Isaac wrote:
> On 11/27/2015 5:37 AM, Stephan Sahm wrote:
>> I like to request a generator/iterator support for np.array(...) as far
>> as list(...) supports it.
>
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromiter.html
>
> hth,
> Alan Isaac
Re: [Numpy-discussion] FeatureRequest: support for array construction from iterators
Nathaniel,

> IMO this is better than making np.array(iter) internally call list(iter) or equivalent

Yeah, but that's not the only option:

    from itertools import chain

    import numpy as np

    def fromiter_awesome_edition(iterable):
        elem = next(iterable)
        # Placeholder name: stands for whatever NumPy does to infer dtypes from lists.
        dtype = whatever_numpy_does_to_infer_dtypes_from_lists(elem)
        return np.fromiter(chain([elem], iterable), dtype=dtype)

I think this would be a huge win for usability. I'm always getting tripped up by the dtype requirement. I can submit a PR if people like this pattern.

btw, I think np.array(['f', 'o', 'o']) would be exactly the expected result for np.array('foo'), but I guess that's just me.

Juan.

On Sat, Dec 12, 2015 at 10:12 AM, Nathaniel Smith wrote:
> Constructing an array from an iterator is fundamentally different from
> constructing an array from an in-memory data structure like a list,
> because in the iterator case it's necessary to either use a
> single-pass algorithm or else create extra temporary buffers that
> cause much higher memory overhead. (Which is undesirable given that
> iterators are mostly used exactly in the case where one wants to
> reduce memory overhead.)
>
> np.fromiter requires the dtype= argument because this is necessary if
> you want to construct the array in a single pass.
>
> np.array(list(iter)) can avoid the dtype argument, because it creates
> that large memory buffer. IMO this is better than making
> np.array(iter) internally call list(iter) or equivalent, because the
> workaround (adding an explicit call to list()) is trivial, while also
> making it obvious to the user what the actual cost of their request
> is. (Explicit is better than implicit.)
>
> In addition, the proposed API has a number of infelicities:
> - We're generally trying to *reduce* the magic in functions like
>   np.array (e.g. the discussions of having less magic for lists with
>   mismatched numbers of elements, or non-list sequences)
> - There's a strong convention in Python that when making a function like
>   np.array generic, it should accept any iter*able* rather than any
>   iter*ator*. But it would be super confusing if np.array({1: 2})
>   returned array([1]), or if array("foo") returned array(["f", "o",
>   "o"]), so we don't actually want to handle all iterables the same.
>   It's somewhat dubious even for iterators (e.g. someone might want to
>   create an object array containing an iterator...)...
>
> hope that helps,
> -n
>
> On Fri, Dec 11, 2015 at 2:27 PM, Stephan Sahm wrote:
> > numpy.fromiter is neither numpy.array nor does it work similar to
> > numpy.array(list(...)) as the dtype argument is necessary
> >
> > is there a reason, why np.array(...) should not work on iterators? I have
> > the feeling that such requests get (repeatedly) dismissed, but so far I
> > haven't found a compelling argument for leaving this feature missing (as
> > a reminder, it is already implemented in a branch)
> >
> > Please let me know if you know about an argument,
> > best,
> > Stephan
> >
> > On 27 November 2015 at 14:18, Alan G Isaac wrote:
> >> On 11/27/2015 5:37 AM, Stephan Sahm wrote:
> >>> I like to request a generator/iterator support for np.array(...) as far
> >>> as list(...) supports it.
> >>
> >> http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromiter.html
> >>
> >> hth,
> >> Alan Isaac
>
> --
> Nathaniel J. Smith -- http://vorpus.org
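As a rough illustration of the pattern Juan describes, the sketch below peeks at the first element, infers a dtype from it, and hands the re-chained stream to np.fromiter. The helper name fromiter_inferred and the use of np.asarray(first).dtype as the inference step are assumptions made for this example, not something NumPy provides or the thread settled on. The result is only sensible when the first element is representative of the rest.

```python
from itertools import chain

import numpy as np


def fromiter_inferred(iterable):
    """Build a 1-D array in a single pass, guessing the dtype from the first item.

    Hypothetical helper for illustration only; not part of NumPy.
    """
    it = iter(iterable)  # accept any iterable, not only iterators
    try:
        first = next(it)
    except StopIteration:
        # Nothing to infer from: return an empty array with the default dtype.
        return np.empty(0)
    # Assumption: the first element is representative of the whole stream.
    dtype = np.asarray(first).dtype
    # Put the consumed element back in front of the remaining items.
    return np.fromiter(chain([first], it), dtype=dtype)


squares = fromiter_inferred(x * x for x in range(5))
print(squares, squares.dtype)
```

The array is still built in a single pass, which is the property Nathaniel's reply cares about; the cost is that a dtype guessed from one element may not suit later elements.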
Re: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy
On Fri, Dec 11, 2015 at 6:25 AM, Thomas Baruchel wrote:
> From time to time it is asked on forums how to extend the precision of
> computation on Numpy arrays. The most common answer given to this question
> is: use dtype=object with some arbitrary precision module like mpmath or
> gmpy.
> See
> http://stackoverflow.com/questions/6876377/numpy-arbitrary-precision-linear-algebra
> or http://stackoverflow.com/questions/21165745/precision-loss-numpy-mpmath
> or
> http://stackoverflow.com/questions/15307589/numpy-array-with-mpz-mpfr-values
>
> While this is obviously the most relevant answer for many users because it
> will allow them to use Numpy arrays exactly as they would have used them
> with native types, the drawback is that, from some point of view, "true"
> vectorization will be lost.
>
> Over the years I got very familiar with the extended double-double type,
> which has (for usual architectures) about 32 accurate digits with faster
> arithmetic than "arbitrary precision types". I even used it for research
> purposes in number theory and I got convinced that it is a wonderful type
> as long as such precision is suitable.
>
> I often implemented it partially under Numpy, most of the time by trying
> to vectorize the libqd library at a low level.
>
> But I recently thought that a very nice and portable way of implementing
> it under Numpy would be to use the existing layer of vectorization on
> floats for computing the arithmetic operations by "columns containing half
> of the numbers" rather than by "full numbers". As a proof of concept I
> wrote the following file:
> https://gist.github.com/baruchel/c86ed748939534d8910d
>
> I converted and vectorized the Algol 60 codes from
> http://szmoore.net/ipdf/documents/references/dekker1971afloating.pdf
> (Dekker, 1971).
>
> A test is provided at the end; for inverting 100,000 numbers, my type is
> about 3 or 4 times faster than GMPY and almost 50 times faster than MPmath.
> It should be even faster for some other operations since I had to create
> another np.ones array for testing this type because inversion isn't
> implemented here (which could of course be done). You can run this file
> yourself (maybe you will have to discard mpmath or gmpy if you don't have
> them).
>
> I would like to discuss how to make something like this available.
>
> a) Would it be relevant to include that in Numpy? (I would think of some
> "contribution" tool rather than including it in the core of Numpy because
> it would be painful to code all ufuncs; on the other hand I am pretty sure
> that many would be happy to perform several arithmetic operations knowing
> that they can't use cos/sin/etc. on this type; in other words, I am not
> sure it would be a good idea to embed it as an every-day type but I think
> it would be nice to have it quickly available in some way). If you agree
> with that, in which way should I code it (the current link is only a
> "proof of concept"; I would be very happy to code it in some cleaner way)?
>
> b) Do you think such an attempt should remain something external to Numpy
> itself and be released on my Github account without being integrated into
> Numpy?

I think astropy does something similar for time and dates. There has also been some talk of adding a user type for ieee 128 bit doubles. I've looked once for relevant code for the latter and, IIRC, the available packages were GPL :(.

Chuck
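To make the "columns containing half of the numbers" idea concrete, here is a minimal sketch of double-double addition in which the high and low halves live in two separate float64 arrays, so every step is an ordinary vectorized NumPy operation. This is not the code from the linked gist; it uses a standard error-free addition (commonly called TwoSum), and the names two_sum and dd_add are hypothetical. It covers addition only and assumes round-to-nearest IEEE doubles with no overflow.

```python
from fractions import Fraction

import numpy as np


def two_sum(a, b):
    """Error-free addition: s is the rounded sum, e the exact rounding error."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def dd_add(x_hi, x_lo, y_hi, y_lo):
    """Add double-double values stored as separate (hi, lo) float64 columns."""
    s, e = two_sum(x_hi, y_hi)
    e = e + (x_lo + y_lo)
    # Renormalise so the low column only carries what the high one cannot hold.
    hi = s + e
    lo = e - (hi - s)
    return hi, lo


# Two values per column: 1 + 1e-20 and 0.1 + 0.2, both inexact in plain float64.
x_hi = np.array([1.0, 0.1])
y_hi = np.array([1e-20, 0.2])
zeros = np.zeros(2)
hi, lo = dd_add(x_hi, zeros, y_hi, zeros)
print(hi, lo)

# Exact rational check that hi + lo equals the true sum of the two inputs.
exact = Fraction(0.1) + Fraction(0.2)
print(Fraction(float(hi[1])) + Fraction(float(lo[1])) == exact)
```

Multiplication and division need the splitting step from Dekker's paper as well, which this sketch leaves out.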
Re: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy
On Dec 11, 2015 7:46 AM, "Charles R Harris" wrote:
> I think astropy does something similar for time and dates. There has also
> been some talk of adding a user type for ieee 128 bit doubles. I've looked
> once for relevant code for the latter and, IIRC, the available packages
> were GPL :(.

You're probably thinking of the __float128 support in gcc, which relies on a LGPL (not GPL) runtime support library. (LGPL = any patches to the support library itself need to remain open source, but no restrictions are imposed on code that merely uses it.)

Still, probably something that should be done outside of numpy itself for now.

-n
Re: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy
On Fri, Dec 11, 2015 at 10:45 AM, Nathaniel Smith wrote:
> On Dec 11, 2015 7:46 AM, "Charles R Harris" wrote:
> > I think astropy does something similar for time and dates. There has
> > also been some talk of adding a user type for ieee 128 bit doubles. I've
> > looked once for relevant code for the latter and, IIRC, the available
> > packages were GPL :(.
>
> You're probably thinking of the __float128 support in gcc, which relies on
> a LGPL (not GPL) runtime support library. (LGPL = any patches to the
> support library itself need to remain open source, but no restrictions are
> imposed on code that merely uses it.)
>
> Still, probably something that should be done outside of numpy itself for
> now.

No, there are several other software packages out there. I know of the gcc version, but was looking for something more portable.

Chuck
Re: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy
I have a mostly complete wrapping of the double-double type from the QD library (http://crd-legacy.lbl.gov/~dhbailey/mpdist/) into a numpy dtype. The real problem is, as David pointed out, that user dtypes aren't quite full equivalents of the builtin dtypes. I can post the code if there is interest.

Something along the lines of what's being discussed here would be nice, since the extended type is subject to such variation.

Eric

On Fri, Dec 11, 2015 at 12:51 PM, Charles R Harris <charlesr.har...@gmail.com> wrote:
> No, there are several other software packages out there. I know of the gcc
> version, but was looking for something more portable.
>
> Chuck