Re: [Numpy-discussion] unique 2d arrays
Hey Josef, I hadn't stumbled upon those posts. Thanks for the hint... it still doesn't look very pythonic or MATLAB-like, though. It would be nice to have a unique function that can take an axis argument.

Cheers,
Peter

josef.p...@gmail.com wrote:

On Tue, Sep 21, 2010 at 2:55 AM, Peter Schmidtke <pschmid...@mmb.pcb.ub.es> wrote:

> Dear all,
>
> I'd like to know if there is a pythonic / numpy way of retrieving the unique rows of a 2d numpy array. I have this:
>
>     [[409 152]
>      [409 152]
>      [409 152]
>      [409 152]
>      [409 152]
>      [409 152]
>      [409 152]
>      [409 152]
>      [409 152]
>      [409 152]
>      [409 152]
>      [426 193]
>      [431 129]]
>
> and I'd like to get this:
>
>     [[409 152]
>      [426 193]
>      [431 129]]
>
> How can I do this without workarounds like string concatenation or similar tricks? numpy.unique flattens the whole array, so it's not really of use here.

One possibility: see the thread at
http://mail.scipy.org/pipermail/numpy-discussion/2009-August/044664.html

Josef

--
Peter Schmidtke
PhD Student
Dept. Physical Chemistry
Faculty of Pharmacy
University of Barcelona
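A minimal sketch of the view trick discussed in the linked thread, assuming a C-contiguous integer array: each row is viewed as a single structured element, so numpy.unique compares whole rows at once instead of flattening the array.

    import numpy as np

    a = np.array([[409, 152],
                  [409, 152],
                  [409, 152],
                  [426, 193],
                  [431, 129]])

    # View each row as one structured element so np.unique compares rows,
    # not individual values; this requires a C-contiguous array.
    b = np.ascontiguousarray(a)
    row_view = b.view([('', b.dtype)] * b.shape[1])
    unique_rows = np.unique(row_view).view(b.dtype).reshape(-1, b.shape[1])
    print(unique_rows)   # [[409 152] [426 193] [431 129]]

Newer numpy releases (1.13 and later) added exactly the requested axis argument, so np.unique(a, axis=0) gives the same result directly.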
[Numpy-discussion] module compiled against ABI version 2000000 but this version of numpy is 1000009
Hi all,

I am trying to manually install the latest releases of scipy and numpy on Mac OS X 10.6 Snow Leopard. I previously used the dmg installer that is available, but that numpy version is too new for some other modules I have and need on my machine, so I went for a manual install of numpy 1.4 and scipy 0.8.

I can import numpy without problems, but when I import scipy.cluster, for example, I get an error message like:

    module compiled against ABI version 2000000 but this version of numpy is 1000009

I understand that my scipy and numpy builds are incompatible... but which numpy version should I use? I already use the newest release, and I cannot go for the svn version because pycuda would no longer work as it should.

Thanks in advance for your insights on this.

Peter Schmidtke

-
PhD Student
Department of Physical Chemistry
School of Pharmacy
University of Barcelona
Barcelona, Spain
Re: [Numpy-discussion] module compiled against ABI version 2000000 but this version of numpy is 1000009
On 29/07/2010, at 19:01, Pauli Virtanen wrote:

> Thu, 29 Jul 2010 18:45:38 +0200, Peter Schmidtke wrote:
>
>> I am trying to manually install the latest releases of scipy and numpy on Mac OS X 10.6 Snow Leopard. I previously used the dmg installer that is available, but that numpy version is too new for some other modules I have and need on my machine,
>
> Which dmg installer? Numpy 1.4.1 is the newest one available.

I used the packages from here: http://stronginference.com/scipy-superpack/

>> so I went for a manual install of numpy 1.4 and scipy 0.8.
>
> How is manually compiling 1.4.1 different from using the 1.4.1 dmg? Or is this some OS X issue where you have multiple versions of Python around?

Well, the scipy superpack uses numpy 2.0 (which is a dev version, I suppose) and scipy 0.9.

>> I can import numpy without problems, but when I import scipy.cluster, for example, I get an error message like:
>>
>>     module compiled against ABI version 2000000 but this version of numpy is 1000009
>
> You have compiled scipy not against 1.4.1, but some other version of Numpy -- probably the SVN trunk, or possibly 1.4.0 (a release that was cancelled because of binary incompatibility).

Well, I thought I had gotten rid of all remaining numpy pieces on the system, but I'll double-check.

> You need to recompile scipy, and check that you use the version of Numpy you expect to use.
>
>> I understand that both scipy and numpy are not compatible... but which numpy version should I use? I already use the newest release
>
> The above indeed shows that you are probably using 1.4.1, but the scipy you import was compiled against either the SVN version of Numpy or 1.4.0.
>
> --
> Pauli Virtanen

Peter Schmidtke

-
PhD Student
Department of Physical Chemistry
School of Pharmacy
University of Barcelona
Barcelona, Spain
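A quick way to confirm which numpy a given interpreter actually picks up before recompiling (a minimal sketch; the point is that a stale copy earlier on sys.path is a common source of these ABI mismatches):

    import numpy
    # Both lines should point at the installation you expect: 1.4.1,
    # living where you installed it, not a leftover copy elsewhere.
    print(numpy.__version__)
    print(numpy.__file__)

If the path is wrong, remove the stale installation, then rebuild scipy from a clean source tree (delete its build/ directory first) so it compiles against the numpy shown above.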
[Numpy-discussion] 3d plane to point cloud fitting using SVD
Dear Numpy Users,

I want to fit a 3d plane to a 3d point cloud, and I saw that one could use SVD for this purpose. As I am very fond of numpy, I saw that svd is implemented in the linalg module.

Currently I have a numpy array called xyz with n rows (number of points) and 3 columns (x, y, z). I calculated the centroid as:

    xyz0 = npy.mean(xyz, axis=0)   # calculate the centroid

Next I shift the centroid of the point cloud to the origin with:

    M = xyz - xyz0

Then, by analogy with matlab (http://www.mathworks.co.jp/matlabcentral/newsreader/view_thread/262996), I write:

    u, s, vh = numpy.linalg.linalg.svd(M)

In the matlab version they use the last column of V to get the a, b, c coefficients for the plane equation, so in numpy I tried:

    a, b, c = vh[:, -1]

The problem is that the equation ax + by + cz = 0 does not represent the plane through my point cloud at all. What am I doing wrong, and how can I get the a, b and c coefficients?

Thanks in advance.

--
Peter Schmidtke
PhD Student at the Molecular Modeling and Bioinformatics Group
Dep. Physical Chemistry
Faculty of Pharmacy
University of Barcelona
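For anyone hitting the same wall: numpy's svd returns vh = V^H, i.e. V transposed, so the right singular vectors are the rows of vh. MATLAB's V(:, end) therefore corresponds to vh[-1], not vh[:, -1]. A minimal sketch with made-up synthetic data (the example plane and noise level are purely illustrative):

    import numpy as np

    # Synthetic cloud near the plane z = 0.1*x + 0.2*y + 1 (illustrative)
    rng = np.random.default_rng(0)
    xy = rng.uniform(-5, 5, size=(100, 2))
    z = 0.1 * xy[:, 0] + 0.2 * xy[:, 1] + 1 + rng.normal(scale=0.01, size=100)
    xyz = np.column_stack([xy, z])

    xyz0 = xyz.mean(axis=0)          # centroid
    M = xyz - xyz0                   # center the cloud at the origin
    u, s, vh = np.linalg.svd(M, full_matrices=False)

    # Normal of the best-fit plane: the right singular vector with the
    # smallest singular value, i.e. the LAST ROW of vh.
    a, b, c = vh[-1]

    # Plane through the centroid: a*x + b*y + c*z + d = 0
    d = -np.dot(vh[-1], xyz0)

Note also that the centered form a*(x-x0) + b*(y-y0) + c*(z-z0) = 0 only reduces to a*x + b*y + c*z = 0 when the centroid sits at the origin, which is another reason the uncentered equation looked wrong.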
Re: [Numpy-discussion] finding close together points.
On Tue, 10 Nov 2009 16:07:32 -0800, Christopher Barker <chris.bar...@noaa.gov> wrote:

> Hi all,
>
> I have a bunch of points in 2-d space, and I need to find out which pairs of points are within a certain distance of one another (regular old Euclidean norm).

How big is your set of points?

> scipy.spatial.KDTree.query_ball_tree() seems like it's built for this. However, I'm a bit confused. The first argument is a kdtree, but I'm calling it as a method of a kdtree -- I want to know which points in the tree I already have are closer than some r from each other. If I call it as:
>
>     tree.query_ball_tree(tree, r)
>
> I get a big list that has all the points in it (some of them paired up with close neighbors). It appears I'm getting the distances between all the points in the tree and itself, as though they were different trees. This is slow, takes a bunch of memory, and I then have to parse the list to find the ones that are paired up. Is there a way to get just the close ones from a single tree?
>
> thanks,
> -Chris

--
Peter Schmidtke
PhD Student at the Molecular Modeling and Bioinformatics Group
Dep. Physical Chemistry
Faculty of Pharmacy
University of Barcelona
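One pointer for this use case: scipy's KDTree also has query_pairs, which searches a single tree against itself and returns each close pair only once. A minimal sketch with stand-in data:

    import numpy as np
    from scipy.spatial import KDTree

    points = np.random.rand(1000, 2)    # stand-in 2-d point set
    tree = KDTree(points)

    # Each pair (i, j) with i < j appears exactly once, so there are no
    # self-matches and no duplicate (j, i) entries to parse out.
    pairs = tree.query_pairs(r=0.01)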
[Numpy-discussion] reading gzip compressed files using numpy.fromfile
Date: Wed, 28 Oct 2009 20:31:43 +0100
From: Peter Schmidtke <pschmid...@mmb.pcb.ub.es>
Subject: [Numpy-discussion] reading gzip compressed files using numpy.fromfile

Dear Numpy Mailing List Readers,

I have a quite simple problem for which I have not found a solution so far. I have a gzipped file lying around that has some numbers stored in it, and I want to read them into a numpy array as fast as possible, but only a chunk of data at a time. So I would like to use numpy's fromfile function. For now I have roughly the following code:

    f = gzip.open("myfile.gz", "r")
    xyz = npy.fromfile(f, dtype="float32", count=400)

So I would read 400 entries from the file, keep it open, process my data, come back and read the next 400 entries. If I do this, numpy complains that the file handle f is not a normal file handle:

    IOError: first argument must be an open file

but in fact it is a zlib file handle. Now, gzip gives access to the underlying file handle through f.fileobj, so I tried:

    xyz = npy.fromfile(f.fileobj, dtype="float32", count=400)

But there I get just meaningless values (not the actual data), and when I specify the sep argument for npy.fromfile I get just .1 and nothing else. Can you tell me why, and how to fix this problem? I know that I could read everything into memory, but these files are rather big, so I simply have to avoid that.

Thanks in advance.

--
Peter Schmidtke
PhD Student at the Molecular Modeling and Bioinformatics Group
Dep. Physical Chemistry
Faculty of Pharmacy
University of Barcelona

------------------------------

Date: Wed, 28 Oct 2009 14:33:11 -0500
From: Robert Kern <robert.k...@gmail.com>
Subject: Re: [Numpy-discussion] reading gzip compressed files using numpy.fromfile

On Wed, Oct 28, 2009 at 14:31, Peter Schmidtke <pschmid...@mmb.pcb.ub.es> wrote:

> I have a gzipped file lying around that has some numbers stored in it, and I want to read them into a numpy array as fast as possible, but only a chunk of data at a time. [...] gzip gives access to the underlying file handle through f.fileobj.

np.fromfile() requires a true file object, not just a file-like object. np.fromfile() works by grabbing the FILE* pointer underneath and using C system calls to read the data, not by calling the .read() method.

> So I tried
>
>     xyz = npy.fromfile(f.fileobj, dtype="float32", count=400)
>
> But there I get just meaningless values (not the actual data)

This is reading the compressed data, not the data that you want.

> Can you tell me why, and how to fix this problem? I know that I could read everything into memory, but these files are rather big, so I simply have to avoid that.

Read in reasonably-sized chunks of bytes at a time, and use np.fromstring() to create arrays from them.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

------------------------------

Date: Wed, 28 Oct 2009 13:26:41 -0700
From: Christopher Barker <chris.bar...@noaa.gov>
Subject: Re: [Numpy-discussion] reading gzip compressed files using numpy.fromfile

Robert Kern wrote:

>> f = gzip.open("myfile.gz", "r")
>> xyz = npy.fromfile(f, dtype="float32", count=400)
>
> Read in reasonably-sized chunks of bytes at a time, and use np.fromstring() to create arrays from them.

Something like:

    count = 400
    xyz = np.fromstring(f.read(count * 4), dtype=np.float32)

should work (untested...)

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/ORR (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526
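Putting the two suggestions together as a self-contained sketch ("myfile.gz" and the chunk size are placeholders; np.frombuffer is the modern replacement for np.fromstring):

    import gzip
    import numpy as np

    count = 400                                  # float32 values per chunk
    itemsize = np.dtype(np.float32).itemsize     # 4 bytes

    with gzip.open("myfile.gz", "rb") as f:
        while True:
            buf = f.read(count * itemsize)       # gzip decompresses transparently
            if not buf:
                break
            chunk = np.frombuffer(buf, dtype=np.float32)
            # ... process chunk (may be shorter than count at end of file) ...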
[Numpy-discussion] numpy loadtxt - ValueError: setting an array element with a sequence.
Have you tried the numpy.fromfile function? It usually worked great for my files that had the same format as yours.

++
Peter

--
PhD Student at the Molecular Modeling and Bioinformatics Group
Dep. Physical Chemistry
Faculty of Pharmacy
University of Barcelona
Re: [Numpy-discussion] numpy loadtxt - ValueError: setting an array element with a sequence.
On Thu, 29 Oct 2009 05:30:09 -0700 (PDT), TheLonelyStar <nabb...@lonely-star.org> wrote:

> After trying the same thing in matlab, I realized that my tsv file is not matrix-style. By this I mean that not all lines have the same length (not the same number of values). What would be the best way to load this?
>
> Regards,
> Nathan

Use the numpy fromfile function. For instance, I read the file:

    5    8    5
    5.5  6.1  3
    5.5  2    6.5

with:

    x = npy.fromfile("test.txt", sep="\t")

and it returns an array x:

    array([ 5. ,  8. ,  5. ,  5.5,  6.1,  3. ,  5.5,  2. ,  6.5])

You can reshape this array to a 3x3 matrix using the reshape function: x.reshape((3, 3)).

--
Peter Schmidtke
PhD Student at the Molecular Modeling and Bioinformatics Group
Dep. Physical Chemistry
Faculty of Pharmacy
University of Barcelona
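Note that reshape only works when every line has the same number of values. For a genuinely ragged file like Nathan describes, one alternative (filename is a placeholder) is to parse line by line into a list of 1-d arrays:

    import numpy as np

    # One array per line; rows may have different lengths.
    with open("test.txt") as f:
        rows = [np.array(line.split("\t"), dtype=float)
                for line in f if line.strip()]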