Re: [Numpy-discussion] building numpy 1.6.2 on OSX 10.6 / Python2.7.3
On Wed, Aug 8, 2012 at 6:15 AM, Andrew Nelson andyf...@gmail.com wrote:
> Dear Pierre,
> as indicated yesterday OSX system python is in:
> /System/Library/Frameworks/Python.framework/
> I am installing into:
> /Library/Frameworks/Python.framework/Versions/Current/lib/python2.7/site-packages
> This should not present a problem and does not explain why numpy does not
> build/import correctly on my setup.

Please give us the build log (when rebuilding from scratch, to have the
complete log), so that we can have a better idea of the issue,

David
Re: [Numpy-discussion] Licensing question
On Wed, Aug 8, 2012 at 12:55 AM, Nathaniel Smith n...@pobox.com wrote:
> On Mon, Aug 6, 2012 at 8:31 PM, Robert Kern robert.k...@gmail.com wrote:
>> Those are not the original Fortran sources. The original Fortran sources
>> are in the public domain as work done by a US federal employee.
>>
>>   http://www.netlib.org/fftpack/
>>
>> Never trust the license of any code on John Burkardt's site. Track it
>> down to the original sources.
>
> Taken together, what those websites seem to be claiming is that you have a
> choice of buggy BSD code or fixed GPL code? I assume someone has already
> taken the appropriate measures for numpy, but it seems like an unfortunate
> situation...

If the code on John Burkardt's website is based on the netlib codebase, he
is not entitled to make it GPL unless he is the sole copyright holder of
the original code.

I think the 'real' solution is to have a separate package linking to FFTW
for people with 'advanced' FFT needs. None of the other libraries I have
looked at so far is usable, fast and precise enough once you move away from
the simple case of double precision and 'well factored' sizes.

regards,
David
Re: [Numpy-discussion] Licensing question
On Wed, Aug 8, 2012 at 10:34 AM, David Cournapeau courn...@gmail.com wrote:
> On Wed, Aug 8, 2012 at 12:55 AM, Nathaniel Smith n...@pobox.com wrote:
>> On Mon, Aug 6, 2012 at 8:31 PM, Robert Kern robert.k...@gmail.com wrote:
>>> Those are not the original Fortran sources. The original Fortran sources
>>> are in the public domain as work done by a US federal employee.
>>>
>>>   http://www.netlib.org/fftpack/
>>>
>>> Never trust the license of any code on John Burkardt's site. Track it
>>> down to the original sources.
>>
>> Taken together, what those websites seem to be claiming is that you have
>> a choice of buggy BSD code or fixed GPL code? I assume someone has
>> already taken the appropriate measures for numpy, but it seems like an
>> unfortunate situation...
>
> If the code on John Burkardt's website is based on the netlib codebase, he
> is not entitled to make it GPL unless he is the sole copyright holder of
> the original code.

He can certainly incorporate the public domain code and rerelease it under
whatever restrictions he likes, especially if he adds to it, which appears
to be the case. The original sources are legitimately public domain, not
just released under a liberal copyright license. He can't remove the
original code from the public domain, but that's not what he claims to have
done.

> I think the 'real' solution is to have a separate package linking to FFTW
> for people with 'advanced' FFT needs. None of the other libraries I have
> looked at so far is usable, fast and precise enough once you move away
> from the simple case of double precision and 'well factored' sizes.

http://pypi.python.org/pypi/pyFFTW

--
Robert Kern
Re: [Numpy-discussion] Licensing question
On Wed, Aug 8, 2012 at 10:53 AM, Robert Kern robert.k...@gmail.com wrote:
> On Wed, Aug 8, 2012 at 10:34 AM, David Cournapeau courn...@gmail.com wrote:
>> On Wed, Aug 8, 2012 at 12:55 AM, Nathaniel Smith n...@pobox.com wrote:
>>> On Mon, Aug 6, 2012 at 8:31 PM, Robert Kern robert.k...@gmail.com wrote:
>>>> Those are not the original Fortran sources. The original Fortran
>>>> sources are in the public domain as work done by a US federal employee.
>>>>
>>>>   http://www.netlib.org/fftpack/
>>>>
>>>> Never trust the license of any code on John Burkardt's site. Track it
>>>> down to the original sources.
>>>
>>> Taken together, what those websites seem to be claiming is that you have
>>> a choice of buggy BSD code or fixed GPL code? I assume someone has
>>> already taken the appropriate measures for numpy, but it seems like an
>>> unfortunate situation...
>>
>> If the code on John Burkardt's website is based on the netlib codebase,
>> he is not entitled to make it GPL unless he is the sole copyright holder
>> of the original code.
>
> He can certainly incorporate the public domain code and rerelease it under
> whatever restrictions he likes, especially if he adds to it, which appears
> to be the case. The original sources are legitimately public domain, not
> just released under a liberal copyright license. He can't remove the
> original code from the public domain, but that's not what he claims to
> have done.
>
>> I think the 'real' solution is to have a separate package linking to FFTW
>> for people with 'advanced' FFT needs. None of the other libraries I have
>> looked at so far is usable, fast and precise enough once you move away
>> from the simple case of double precision and 'well factored' sizes.
>
> http://pypi.python.org/pypi/pyFFTW

Nice, I am starting to get out of touch with too many packages... Would be
nice to add DCT and DST support to it.

David
[Numpy-discussion] Bug in as_strided/reshape
It seems that reshape doesn't work correctly on an array which has been
resized using the 0-stride trick, e.g.

In [73]: x = array([5])

In [74]: y = as_strided(x, shape=(10,), strides=(0,))

In [75]: y
Out[75]: array([5, 5, 5, 5, 5, 5, 5, 5, 5, 5])

In [76]: y.reshape([10,1])
Out[76]:
array([[          5],
       [          8],
       [  762933412],
       [-2013265919],
       [         26],
       [         64],
       [  762933414],
       [-2013244356],
       [         26],
       [         64]])

Should all be 5.

In [77]: y.copy().reshape([10,1])
Out[77]:
array([[5],
       [5],
       [5],
       [5],
       [5],
       [5],
       [5],
       [5],
       [5],
       [5]])

In [78]: np.__version__
Out[78]: '1.6.2'

Perhaps a clause such as below is required in reshape?

if any(stride == 0 for stride in y.strides):
    return y.copy().reshape(shape)
else:
    return y.reshape(shape)

Regards,
Dave
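In the meantime, a small user-side workaround sketch (the helper name is
made up for illustration, it is not a numpy API):

import numpy as np
from numpy.lib.stride_tricks import as_strided

def reshape_copying_zero_strides(arr, shape):
    # Copy first if any stride is zero, since reshape mishandles such views.
    if any(s == 0 for s in arr.strides):
        arr = arr.copy()
    return arr.reshape(shape)

x = np.array([5])
y = as_strided(x, shape=(10,), strides=(0,))
print reshape_copying_zero_strides(y, (10, 1))   # all fives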
[Numpy-discussion] nested loops too slow
Hi,

I'm trying to write code for doing a 2D integral. It works well when I do
it with normal 'for' loops, but it requires two nested loops and is much
too slow for my application. I would like to know if it is possible to do
it faster, for example with fancy indexing and numpy.cumsum(). I couldn't
find a solution -- do you have an idea?

The code is the following:
http://bpaste.net/show/cAkMBd3sUmhDXq0sIpZ5/

'flux2' is the result of the calculation with the 'for' loop
implementation, and 'flux' is supposed to be the same result without the
loops. I managed to do it for the single loops (line 23 is identical to
lines 20-21, and line 30 is identical to lines 27-28), but I don't know how
to do it for the nested loops at lines 33-35 (line 40 does not give the
same result).

Any idea?

Thanks much
Nico
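Since the script behind the paste link isn't reproduced here, a generic
sketch of the pattern (the integrand f, sizes nx/ny and spacings dx/dy are
made up, only the flux/flux2 naming follows the post):

import numpy as np

nx, ny = 50, 60
dx, dy = 0.1, 0.2
f = np.random.rand(nx, ny)

# Nested-loop version: flux2[i, j] = sum of f[:i+1, :j+1] * dx * dy
flux2 = np.empty((nx, ny))
for i in range(nx):
    for j in range(ny):
        flux2[i, j] = f[:i + 1, :j + 1].sum() * dx * dy

# Loop-free version: cumulative sums along both axes.
flux = np.cumsum(np.cumsum(f, axis=0), axis=1) * dx * dy

print np.allclose(flux, flux2)   # True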
[Numpy-discussion] Is there a more efficient way to do this?
Is there a more efficient way to calculate the slices array below?

import numpy
import numpy.random

# In reality, this is between 1 and 50.
DIMENSIONS = 20
# In my real app, I have 100...1M data rows.
ROWS = 1000
DATA = numpy.random.random_integers(0, 100, (ROWS, DIMENSIONS))
# This is between 0..DIMENSIONS-1
DRILLBY = 3
# Array of row indices that orders the data by the given dimension.
o = numpy.argsort(DATA[:, DRILLBY])
# Input of my task: the data ordered by the given dimension.
print DATA[o, DRILLBY]
#~ [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   1   1   1
#~    1   1   1   1   1   1   1   1   2   2   2   2   2   2   2   2   2   2
#~    2   3   3   3   3   3   3   3   4   4   4   4   4   4   4   4   4   4
#~    4   4   4   4   5   5   5   5   5   5   5   5   5   5   6   6   6   6
#~ many more things here
#~   96  96  96  97  97  97  97  97  97  97  97  97  98  98  98  98  98  98
#~   99  99  99  99  99  99  99  99  99  99  99  99  99  99  99  99 100 100
#~  100 100 100 100 100 100 100 100 100 100]

# Output of my task: determine slices for the same values on the DRILLBY
# dimension.
slices = []
prev_val = None
sidx = -1
# Dimension values for the given dimension.
fdv = DATA[:, DRILLBY]
# Go over the rows, sorted by values of didx
for oidx, rowidx in enumerate(o):
    val = fdv[rowidx]
    if val != prev_val:
        if prev_val is None:
            prev_val = val
            sidx = oidx
        else:
            slices.append((prev_val, sidx, oidx))
            sidx = oidx
            prev_val = val
if (sidx >= 0) and (sidx < ROWS):
    slices.append((val, sidx, ROWS))
slices = numpy.array(slices, dtype=numpy.int64)

# This is what I want to have!
print slices
#~ [[   0    0   14]
#~  [   1   14   26]
#~  [   2   26   37]
#~  [   3   37   44]
#~  [   4   44   58]
#~  many more values here
#~  [  96  952  957]
#~  [  97  957  966]
#~  [  98  966  972]
#~  [  99  972  988]
#~  [ 100  988 1000]]

So for example, to get all row indices where the dimension value is zero:
the zeros are at rows o[0:14]. Or, to get all row indices where the
dimension value is 99: o[988:1000], etc.

I do not want to make copies of DATA, because it can be huge. The argsort
is fast enough; I just need to create the slices for different dimensions.
The above code works, but it does a linear search implemented in pure
Python: for every iteration, Python code is executed. For 1 million rows,
this is very slow. Is there a way to produce the slices with numpy code? I
could write C code for this, but I would prefer to do it with mass numpy
operations.

Thanks,
Laszlo
Re: [Numpy-discussion] building numpy 1.6.2 on OSX 10.6 / Python2.7.3
`python setup.py install --user` should install numpy into a ~/.local
directory; you'll just have to update your PYTHONPATH.

As of Python 2.6/3.x, ~/.local is searched automatically: it is added
before the system site directories, but after Python's default search
paths and PYTHONPATH. See PEP 370 for more details if you're curious:
http://www.python.org/dev/peps/pep-0370/

A
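One quick way to check where the user site directory ends up, and which
numpy actually gets imported (just a sketch; the exact path differs between
platforms and framework builds):

import site, numpy
print site.getusersitepackages()   # where `--user` installs go
print site.ENABLE_USER_SITE        # should be True for it to be searched
print numpy.__file__               # which numpy is picked up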
Re: [Numpy-discussion] building numpy 1.6.2 on OSX 10.6 / Python2.7.3
FWIW, on my OS X.6.8 system, using a brew-installed python, and installing
into /usr/local/lib, the symbols appear to be present:

❯ nm /usr/local/lib/python2.7/site-packages/numpy/core/multiarray.so | grep ceil
         U _ceil
         U _ceilf
         U _ceill
000c6550 T _npy_ceil
000c6340 T _npy_ceilf
000c6760 T _npy_ceill

I agree with Dave, we're going to need to see your build log to have a
better chance at diagnosing what went wrong with the build.

A
Re: [Numpy-discussion] Is there a more efficient way to do this?
On Wed, Aug 8, 2012 at 9:19 AM, Laszlo Nagy gand...@shopzeus.com wrote:
> Is there a more efficient way to calculate the slices array below?
>
> I do not want to make copies of DATA, because it can be huge. The argsort
> is fast enough; I just need to create the slices for different dimensions.
> The above code works, but it does a linear search implemented in pure
> Python: for every iteration, Python code is executed. For 1 million rows,
> this is very slow. Is there a way to produce the slices with numpy code? I
> could write C code for this, but I would prefer to do it with mass numpy
> operations.
>
> Thanks,
> Laszlo

#Code

import numpy as np

#rows between 100 to 1M
rows = 1000
data = np.random.random_integers(0, 100, rows)

def get_slices_slow(data):
    o = np.argsort(data)
    slices = []
    prev_val = None
    sidx = -1
    for oidx, rowidx in enumerate(o):
        val = data[rowidx]
        if not val == prev_val:
            if prev_val is None:
                prev_val = val
                sidx = oidx
            else:
                slices.append((prev_val, sidx, oidx))
                sidx = oidx
                prev_val = val
    if (sidx >= 0) and (sidx < rows):
        slices.append((val, sidx, rows))
    slices = np.array(slices, dtype=np.int64)
    return slices

def get_slices_fast(data):
    nums = np.unique(data)
    slices = np.zeros((len(nums), 3), dtype=np.int64)
    slices[:, 0] = nums
    count = 0
    for i, num in enumerate(nums):
        count += (data == num).sum()
        slices[i, 2] = count
    slices[1:, 1] = slices[:-1, 2]
    return slices

def get_slices_faster(data):
    nums = np.unique(data)
    slices = np.zeros((len(nums), 3), dtype=np.int64)
    slices[:, 0] = nums
    count = np.bincount(data)
    slices[:, 2] = count.cumsum()
    slices[1:, 1] = slices[:-1, 2]
    return slices

#Testing in ipython

In [2]: (get_slices_slow(data) == get_slices_fast(data)).all()
Out[2]: True

In [3]: (get_slices_slow(data) == get_slices_faster(data)).all()
Out[3]: True

In [4]: timeit get_slices_slow(data)
100 loops, best of 3: 3.51 ms per loop

In [5]: timeit get_slices_fast(data)
1000 loops, best of 3: 1.76 ms per loop

In [6]: timeit get_slices_faster(data)
1 loops, best of 3: 116 us per loop

So using the fast bincount and array indexing methods gets you about a
factor of 30 improvement. Even just doing the counting in a loop with good
indexing will get you a factor of 2.

~Brett
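For what it's worth, another sketch that avoids bincount's restriction to
small non-negative integers (the function name is just for illustration):

import numpy as np

def get_slices_searchsorted(data):
    # Works for any sortable values, not only small non-negative ints.
    o = np.argsort(data)
    s = data[o]                     # the data in sorted order
    nums = np.unique(data)          # the distinct values
    starts = np.searchsorted(s, nums, side='left')
    stops = np.searchsorted(s, nums, side='right')
    return np.column_stack((nums, starts, stops))

(get_slices_searchsorted(data) == get_slices_slow(data)).all() should also
come out True.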
Re: [Numpy-discussion] Licensing question
> Nice, I am starting to get out of touch with too many packages... Would
> be nice to add DCT and DST support to it.

FWIW, the DCT has been in scipy.fftpack for a while, and the DST was just
added.
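A quick sketch of what that looks like (assuming a scipy recent enough to
have dst/idst, roughly 0.11+):

import numpy as np
from scipy.fftpack import dct, idct, dst, idst

x = np.random.rand(8)
y = dct(x, type=2, norm='ortho')                       # type-II DCT
print np.allclose(idct(y, type=2, norm='ortho'), x)    # True
z = dst(x, type=2, norm='ortho')                       # type-II DST
print np.allclose(idst(z, type=2, norm='ortho'), x)    # True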
[Numpy-discussion] possible bug in assignment to complex array
Dear List,

I think there is a problem with assigning a 1D complex array of length one
to a position in another complex array. Example:

a = ones(1, 'D')
b = ones(1, 'D')
a[0] = b
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-37-0c4fc6d780e3> in <module>()
----> 1 a[0] = b

TypeError: can't convert complex to float

This works correctly when a and b are real arrays:

a = ones(1)
b = ones(1)
a[0] = b

Bug or feature?

Thanks,
Mark
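Until that's resolved, two ways of writing the assignment that sidestep the
array-to-complex-scalar conversion (a sketch of workarounds, not a fix):

import numpy as np

a = np.ones(1, 'D')
b = np.ones(1, 'D')

a[0] = b[0]    # assign the scalar element
a[0:1] = b     # or assign into a length-1 slice
print a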
Re: [Numpy-discussion] Is there a more efficient way to do this?
> In [4]: timeit get_slices_slow(data)
> 100 loops, best of 3: 3.51 ms per loop
>
> In [5]: timeit get_slices_fast(data)
> 1000 loops, best of 3: 1.76 ms per loop
>
> In [6]: timeit get_slices_faster(data)
> 1 loops, best of 3: 116 us per loop
>
> So using the fast bincount and array indexing methods gets you about a
> factor of 30 improvement. Even just doing the counting in a loop with good
> indexing will get you a factor of 2.

Fantastic, thank you! I do not fully understand your code yet, but I'm
going to read all the related docs. :-)
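In case it helps, the building blocks on a tiny made-up array:

import numpy as np

data = np.array([1, 0, 2, 1, 1, 2])
count = np.bincount(data)   # occurrences of 0, 1, 2, ... -> array([1, 3, 2])
ends = count.cumsum()       # running totals              -> array([1, 4, 6])
# In argsort order, value 0 occupies o[0:1], value 1 occupies o[1:4] and
# value 2 occupies o[4:6]; each slice starts where the previous one ends.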
Re: [Numpy-discussion] building numpy 1.6.2 on OSX 10.6 / Python2.7.3
On Wed, Aug 8, 2012 at 8:43 AM, David Cournapeau courn...@gmail.com wrote:
> On Wed, Aug 8, 2012 at 6:15 AM, Andrew Nelson andyf...@gmail.com wrote:
>> Dear Pierre,
>> as indicated yesterday OSX system python is in:
>> /System/Library/Frameworks/Python.framework/
>> I am installing into:
>> /Library/Frameworks/Python.framework/Versions/Current/lib/python2.7/site-packages
>> This should not present a problem and does not explain why numpy does not
>> build/import correctly on my setup.
>
> Please give us the build log (when rebuilding from scratch, to have the
> complete log), so that we can have a better idea of the issue,
>
> David

The build log for the build that fails on my machine can be found at:

http://dl.dropbox.com/u/15288921/log

Examining the symbols again:

p0100m:core anz$ pwd
/Users/anz/Downloads/numpy-1.6.2/build/lib.macosx-10.6-intel-2.7/numpy/core
p0100m:core anz$ nm multiarray.so | grep ceil
         U _npy_ceil
         U _npy_ceil