Re: [Numpy-discussion] np.loadtxt : yet a new implementation...
On 12/2/2008 7:21 AM Joris De Ridder apparently wrote:
> As a historical note, we used to have scipy.io.read_array which at the
> time was considered by Travis too slow and too grandiose to be put in
> Numpy. As a consequence, numpy.loadtxt() was created which was simple
> and fast. Now it looks like we're going back to something grandiose.
> But perhaps it can be made grandiose *and* reasonably fast ;-).

I hope this consideration remains prominent in this thread.

Is the disappearance of read_array the reason for this change? What happened to it? Note that read_array_demo1.py is still in scipy.io despite the loss of read_array.

Alan Isaac

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] np.loadtxt : yet a new implementation...
On 12/2/2008 8:12 AM Alan G Isaac apparently wrote:
> I hope this consideration remains prominent in this thread. Is the
> disappearance of read_array the reason for this change? What happened
> to it?

Apologies; it is only deprecated, not gone.

Alan Isaac
Re: [Numpy-discussion] bug in ma.masked_all()?
Pierre GM wrote:
> Eric,
> That's quite a handful you have with this dtype...

Here is a simplified example of how I made it:

    dt = np.dtype({'names': ['a','b'], 'formats': ['f', 'f'], 'titles': ['aaa', 'bbb']})

From page 132 in the numpy book:

    The fields dictionary is indexed by keys that are the names of the
    fields. Each entry in the dictionary is a tuple fully describing the
    field: (dtype, offset[,title]). If present, the optional title can
    actually be any object (if it is string or unicode then it will also
    be a key in the fields dictionary, otherwise it’s meta-data).

I put the titles in as a sort of additional documentation, and thinking that they might be useful for labeling plots; but it is rather hard to get the titles back out, since they are not directly accessible as an attribute, like names. Probably I should just omit them.

Eric

> So yes, the fix I gave works with nested dtypes and flexible dtypes
> with a simple name (string, not tuple). I'm a bit surprised with
> numpy, here. Consider:
>
>     dt.names
>     ('P', 'D', 'T', 'w', 'S', 'sigtheta', 'theta')
>
> So we lose the tuple and get a single string instead, corresponding to
> the right-hand element of the name. But this single string is one of
> the keys of dt.fields, whereas the tuple is not. Puzzling. I'm sure
> there must be some reference in the numpy book, but I can't look for
> it now.
>
> Anyway: Prior to version 6127, make_mask_descr was substituting the
> 2nd element of each tuple of a dtype.descr by a bool. Which failed for
> nested dtypes. Now, we check the field corresponding to a name, which
> fails in our particular case. I'll be working on it...
>
> On Dec 2, 2008, at 1:59 AM, Eric Firing wrote:
>>     dt = np.dtype([((' Pressure, Digiquartz [db]', 'P'), 'f4'),
>>                    ((' Depth [salt water, m]', 'D'), 'f4'),
>>                    ((' Temperature [ITS-90, deg C]', 'T'), 'f4'),
>>                    ((' Descent Rate [m/s]', 'w'), 'f4'),
>>                    ((' Salinity [PSU]', 'S'), 'f4'),
>>                    ((' Density [sigma-theta, Kg/m^3]', 'sigtheta'), 'f4'),
>>                    ((' Potential Temperature [ITS-90, deg C]', 'theta'), 'f4')])
>>     np.ma.zeros((2,2), dt)
Re: [Numpy-discussion] bug in ma.masked_all()?
Eric,
That's quite a handful you have with this dtype...

So yes, the fix I gave works with nested dtypes and flexible dtypes with a simple name (string, not tuple). I'm a bit surprised with numpy, here. Consider:

    dt.names
    ('P', 'D', 'T', 'w', 'S', 'sigtheta', 'theta')

So we lose the tuple and get a single string instead, corresponding to the right-hand element of the name. But this single string is one of the keys of dt.fields, whereas the tuple is not. Puzzling. I'm sure there must be some reference in the numpy book, but I can't look for it now.

Anyway: Prior to version 6127, make_mask_descr was substituting the 2nd element of each tuple of a dtype.descr by a bool. Which failed for nested dtypes. Now, we check the field corresponding to a name, which fails in our particular case. I'll be working on it...

On Dec 2, 2008, at 1:59 AM, Eric Firing wrote:
>     dt = np.dtype([((' Pressure, Digiquartz [db]', 'P'), 'f4'),
>                    ((' Depth [salt water, m]', 'D'), 'f4'),
>                    ((' Temperature [ITS-90, deg C]', 'T'), 'f4'),
>                    ((' Descent Rate [m/s]', 'w'), 'f4'),
>                    ((' Salinity [PSU]', 'S'), 'f4'),
>                    ((' Density [sigma-theta, Kg/m^3]', 'sigtheta'), 'f4'),
>                    ((' Potential Temperature [ITS-90, deg C]', 'theta'), 'f4')])
>     np.ma.zeros((2,2), dt)
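For readers who want to see the names-versus-fields behaviour without the full dtype, here is a minimal sketch. The two-field dtype and its shortened titles are stand-ins made up for illustration, not Eric's actual definition.

```python
import numpy as np

# Shortened stand-in for the dtype in the thread (assumed titles):
# each field is declared as ((title, name), format).
dt = np.dtype([(('Pressure [db]', 'P'), 'f4'),
               (('Depth [m]', 'D'), 'f4')])

# Only the right-hand element of each tuple becomes a field name.
names = dt.names

# A string title is a key of dt.fields alongside the name,
# while the (title, name) tuple itself is not a key.
title_is_key = 'Pressure [db]' in dt.fields
tuple_is_key = ('Pressure [db]', 'P') in dt.fields

# With a title present, the entry is (dtype, offset, title).
entry = dt.fields['P']
```

This matches the passage from the numpy book quoted elsewhere in the thread: a string title doubles as a key of the fields dictionary.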
[Numpy-discussion] [F2PY] Fortran call fails in IDLE / PyScripter
Hi all,

I compile the following code using

    f2py -c --fcompiler=gnu95 --compiler=mingw32 -m hello

          subroutine AfficheMessage(szText)
          character szText*100
          write (*,*) szText
          return
          end

Using the python console:

    import hello
    hello.affichemessage(" Hello")

works fine!

I do the same in the program window of IDLE and:
- no message is displayed.
- the shell restarts (or IDLE crashes if launched with -n)

Same problem with PyScripter IDE (crash).

Any suggestion?

Regards,
Christophe
Re: [Numpy-discussion] fast way to convolve a 2d array with 1d filter
You can use 2D convolution routines either in scipy.signal or numpy.numarray.nd_image

Nadav

-----Original Message-----
From: [EMAIL PROTECTED] on behalf of frank wang
Sent: Tue 02-Dec-08 03:38
To: numpy-discussion@scipy.org
Subject: [Numpy-discussion] fast way to convolve a 2d array with 1d filter

Hi,

I need to convolve a 1d filter with 8 coefficients with a 2d array of the shape (6,7). I can use convolve to perform the operation for each row. This will involve a for loop over the 6 rows. I wonder if there is a fast way to do this in numpy without using a for loop. Does anyone know how to do it?

Thanks

Frank
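As a rough illustration of a loop-free approach in plain numpy, the sketch below writes the row-wise convolution as one matrix product. The helper name `convolve_rows` and the sample data are assumptions for the example; for larger arrays the SciPy routines Nadav mentions are likely the better choice.

```python
import numpy as np

def convolve_rows(a, h):
    """Full 1-D convolution of every row of `a` with the filter `h`.

    Builds a banded matrix T with T[m, m + k] = h[k], so that
    np.dot(a, T) convolves all rows at once, without a Python loop.
    """
    a = np.asarray(a, dtype=float)
    h = np.asarray(h, dtype=float)
    n_cols = a.shape[1]
    rows = np.arange(n_cols)[:, None]           # shape (n_cols, 1)
    cols = rows + np.arange(h.size)[None, :]    # shape (n_cols, len(h))
    T = np.zeros((n_cols, n_cols + h.size - 1))
    T[rows, cols] = h                           # h is broadcast over the rows
    return np.dot(a, T)

# Sample data with the shapes from the question: (6, 7) array, 8 coefficients.
a = np.arange(42, dtype=float).reshape(6, 7)
h = np.arange(1, 9, dtype=float)
result = convolve_rows(a, h)
```

Each row of `result` should equal `np.convolve(row, h)` for the corresponding row of `a`; the matrix `T` costs extra memory, which is negligible at this size.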
Re: [Numpy-discussion] np.loadtxt : yet a new implementation...
On 1 Dec 2008, at 21:47 , Stéfan van der Walt wrote:
> Hi Pierre
>
> 2008/12/1 Pierre GM [EMAIL PROTECTED]:
>> * `genloadtxt` is the base function that makes all the work. It
>> outputs 2 arrays, one for the data (missing values being substituted
>> by the appropriate default) and one for the mask. It would go in
>> np.lib.io
>
> I see the code length increased from 200 lines to 800. This made me
> wonder about the execution time: initial benchmarks suggest a 3x
> slow-down. Could this be a problem for loading large text files? If
> so, should we consider keeping both versions around, or by default
> bypassing all the extra hooks?
>
> Regards
> Stéfan

As a historical note, we used to have scipy.io.read_array which at the time was considered by Travis too slow and too grandiose to be put in Numpy. As a consequence, numpy.loadtxt() was created which was simple and fast. Now it looks like we're going back to something grandiose. But perhaps it can be made grandiose *and* reasonably fast ;-).

Cheers,
Joris

P.S. As a reference: http://article.gmane.org/gmane.comp.python.numeric.general/5556/

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
Re: [Numpy-discussion] bug in ma.masked_all()?
On Dec 2, 2008, at 4:26 AM, Eric Firing wrote:
> From page 132 in the numpy book:
>
>     The fields dictionary is indexed by keys that are the names of the
>     fields. Each entry in the dictionary is a tuple fully describing
>     the field: (dtype, offset[,title]). If present, the optional title
>     can actually be any object (if it is string or unicode then it will
>     also be a key in the fields dictionary, otherwise it’s meta-data).

I should read it more often...

> I put the titles in as a sort of additional documentation, and
> thinking that they might be useful for labeling plots;

That's actually quite a good idea...

> but it is rather hard to get the titles back out, since they are not
> directly accessible as an attribute, like names. Probably I should
> just omit them.

We could perhaps try a function:

    def gettitle(dtype, name):
        try:
            field = dtype.fields[name]
        except (TypeError, KeyError):
            return None
        else:
            if len(field) > 2:
                return field[-1]
            return None
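To try the proposed helper, here is a self-contained version applied to Eric's simplified dtype. The comparison in the body is assumed to be `len(field) > 2`, since a field entry only has a third element when a title was given.

```python
import numpy as np

def gettitle(dtype, name):
    """Return the title stored for field `name`, or None if there is none."""
    try:
        field = dtype.fields[name]
    except (TypeError, KeyError):
        # TypeError: dtype.fields is None (no fields at all).
        # KeyError: no field with that name.
        return None
    # A field entry is (dtype, offset) or (dtype, offset, title).
    if len(field) > 2:
        return field[-1]
    return None

# Eric's simplified example from earlier in the thread.
dt = np.dtype({'names': ['a', 'b'],
               'formats': ['f', 'f'],
               'titles': ['aaa', 'bbb']})
titles = [gettitle(dt, name) for name in dt.names]
```

Looping over `dt.names` like this is one way to recover plot labels in field order, which was the use Eric had in mind for the titles.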
Re: [Numpy-discussion] [F2PY] Fortran call fails in IDLE / PyScripter
On Tue, Dec 2, 2008 at 9:26 AM, Christophe Chappet [EMAIL PROTECTED] wrote:
> Hi all,
> I compile the following code using
>
>     f2py -c --fcompiler=gnu95 --compiler=mingw32 -m hello
>
>           subroutine AfficheMessage(szText)
>           character szText*100
>           write (*,*) szText
>           return
>           end
>
> Using the python console:
>
>     import hello
>     hello.affichemessage(" Hello")
>
> works fine!
> I do the same in the program window of IDLE and:
> - no message is displayed.
> - the shell restarts (or IDLE crashes if launched with -n)
> Same problem with PyScripter IDE (crash).
> Any suggestion?
> Regards,
> Christophe

Is this a write to standard output, "write (*,*) szText"?

Robert Kern mentioned several times that mingw is broken for writing to stdout, but I only know about it for stdout in C. I always get a crash when a test compiles a write to stdout in C with mingw on my WindowsXP. But then my impression is that it shouldn't work on the command line either.

Since I don't know much about f2py, I'm not sure whether Fortran has the same problem as C with mingw.

Josef
Re: [Numpy-discussion] bug in ma.masked_all()?
Pierre GM wrote:
> On Dec 2, 2008, at 1:59 AM, Eric Firing wrote:
>> Pierre,
>> Your change fixed masked_all for the example I gave, but I think it
>> introduced a new failure in zeros:
>
> Eric,
> Would you mind giving r6131 a try ? It's rather ugly but looks like it
> works...

So far, so good. Thanks very much.

Eric
Re: [Numpy-discussion] np.loadtxt : yet a new implementation...
Hi Pierre,

I've tested the new loadtxt briefly. Looks good, except that there's a minor bug when trying to use a specific white-space delimiter (e.g. \t) while still allowing other white-space in fields (e.g. spaces).

Specifically, on line 115 in LineSplitter, we have:

    self.delimiter = delimiter.strip() or None

so if I pass in, say, '\t' as the delimiter, self.delimiter gets set to None, which then causes the default behavior of any-whitespace-is-delimiter to be used. This makes lines like "Gene Name\tPubMed ID\tStarting Position" get split wrong, even when I explicitly pass in '\t' as the delimiter!

Similarly, I believe that some of the tests are formulated wrong:

    def test_nodelimiter(self):
        "Test LineSplitter w/o delimiter"
        strg = "1 2 3 4  5 # test"
        test = LineSplitter(' ')(strg)
        assert_equal(test, ['1', '2', '3', '4', '5'])

I think that treating an explicitly-passed-in ' ' delimiter as identical to 'no delimiter' is a bad idea. If I say that ' ' is the delimiter, or '\t' is the delimiter, this should be treated *just* like ',' being the delimiter, where the expected output is:

    ['1', '2', '3', '4', '', '5']

At least, that's what I would expect. Treating contiguous blocks of whitespace as single delimiters is perfectly reasonable when None is provided as the delimiter, but when an explicit delimiter has been provided, it strikes me that the code shouldn't try to further-interpret it...

Does anyone else have any opinion here?

Zach

On Dec 1, 2008, at 1:21 PM, Pierre GM wrote:
> Well, looks like the attachment is too big, so here's the
> implementation. The tests will come in another message.
>
> genload_proposal.py
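The difference Zach describes can be reproduced with plain `str.split`. In the sketch below, `split_line` is a hypothetical stand-in for the proposed behaviour, not the actual LineSplitter code.

```python
def split_line(line, delimiter=None):
    """Hypothetical stand-in for LineSplitter, following Zach's proposal."""
    # None means "any run of whitespace"; anything else is used literally.
    return line.split(delimiter)

# The expression quoted from line 115 turns an explicit tab into None,
# because '\t'.strip() is the empty string.
collapsed = '\t'.strip() or None

line = 'Gene Name\tPubMed ID\tStarting Position'
by_whitespace = split_line(line, collapsed)   # delimiter silently became None
by_tab = split_line(line, '\t')               # what the caller asked for

# An explicit ' ' delimiter keeps the empty field between two spaces.
with_empty_field = split_line('1 2 3 4  5', ' ')
```

With the delimiter collapsed to None, the three tab-separated fields come back as six words, which is the mis-split reported above.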
Re: [Numpy-discussion] np.loadtxt : yet a new implementation...
Zachary Pincus wrote:
> Specifically, on line 115 in LineSplitter, we have:
>
>     self.delimiter = delimiter.strip() or None
>
> so if I pass in, say, '\t' as the delimiter, self.delimiter gets set
> to None, which then causes the default behavior of
> any-whitespace-is-delimiter to be used. This makes lines like
> "Gene Name\tPubMed ID\tStarting Position" get split wrong, even when I
> explicitly pass in '\t' as the delimiter!
>
> Similarly, I believe that some of the tests are formulated wrong:
>
>     def test_nodelimiter(self):
>         "Test LineSplitter w/o delimiter"
>         strg = "1 2 3 4  5 # test"
>         test = LineSplitter(' ')(strg)
>         assert_equal(test, ['1', '2', '3', '4', '5'])
>
> I think that treating an explicitly-passed-in ' ' delimiter as
> identical to 'no delimiter' is a bad idea. If I say that ' ' is the
> delimiter, or '\t' is the delimiter, this should be treated *just*
> like ',' being the delimiter, where the expected output is:
>
>     ['1', '2', '3', '4', '', '5']
>
> At least, that's what I would expect. Treating contiguous blocks of
> whitespace as single delimiters is perfectly reasonable when None is
> provided as the delimiter, but when an explicit delimiter has been
> provided, it strikes me that the code shouldn't try to
> further-interpret it...
>
> Does anyone else have any opinion here?

I agree. If the user explicitly passes something as a delimiter, we should use it and not try to be too smart. +1

Ryan

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
Re: [Numpy-discussion] [F2PY] Fortran call fails in IDLE / PyScripter
On Tue, Dec 2, 2008 at 08:26, Christophe Chappet [EMAIL PROTECTED] wrote:
> Hi all,
> I compile the following code using
>
>     f2py -c --fcompiler=gnu95 --compiler=mingw32 -m hello
>
>           subroutine AfficheMessage(szText)
>           character szText*100
>           write (*,*) szText
>           return
>           end
>
> Using the python console:
>
>     import hello
>     hello.affichemessage(" Hello")
>
> works fine!
> I do the same in the program window of IDLE and:
> - no message is displayed.
> - the shell restarts (or IDLE crashes if launched with -n)
> Same problem with PyScripter IDE (crash).

What version of gfortran are you using (i.e. exactly which binary did you download)?

I'm not sure about the crash, but I can say that you will never get the output from a write statement inside the Fortran code to go to the IDLE prompt or PyScripter's window. They are not real terminals and do not capture text going to the process's real STDOUT file pointer. They simply change the sys.stdout object to capture text printed from Python.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
 -- Umberto Eco
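Robert's point about sys.stdout versus the process's real standard output can be seen from pure Python. The sketch below is only an analogy: `os.write(1, ...)` stands in for output produced by compiled code, and the function name is made up for the example.

```python
import contextlib
import io
import os

def capture_python_stdout():
    """Replace sys.stdout for a moment and report what the replacement saw."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # Goes through the sys.stdout object, so the replacement sees it.
        print("written through sys.stdout")
        # Goes straight to file descriptor 1 and bypasses sys.stdout,
        # much as output from compiled code would (assumed analogy).
        os.write(1, b"written to file descriptor 1\n")
    return buffer.getvalue()

captured = capture_python_stdout()
```

Only the `print` text ends up in `captured`; the second line goes to wherever the process's standard output points, which in an IDE shell is typically not the visible window.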
Re: [Numpy-discussion] np.loadtxt : yet a new implementation...
Pierre GM wrote:
> Well, looks like the attachment is too big, so here's the
> implementation. The tests will come in another message.

A couple of quick nitpicks:

1) On line 186 (in the NameValidator class), you use excludelist.append() to append a list to the end of a list. I think you meant to use excludelist.extend()

2) When validating a list of names, why do you insist on lower casing them? (I'm referring to the call to lower() on line 207). On one hand, this would seem nicer than all upper case, but on the other hand this can cause confusion for someone who sees certain casing of names in the file and expects that data to be laid out the same.

Other than those, it's working fine for me here.

Ryan

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
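The first nitpick comes down to the difference between the two list methods. A small sketch, with list contents invented for illustration (the real defaults live in NameValidator):

```python
# Hypothetical contents; not the actual NameValidator defaults.
defaults = ['return', 'file', 'print']
user_excludes = ['dates', 'values']

appended = list(defaults)
appended.append(user_excludes)   # adds ONE element: the list itself

extended = list(defaults)
extended.extend(user_excludes)   # adds each name as its own element

# A membership test only finds the user's names in the extended list.
found_after_append = 'dates' in appended
found_after_extend = 'dates' in extended
```

With `append`, a later check such as `name in excludelist` would miss every user-supplied name, because they are hidden inside a nested list.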
Re: [Numpy-discussion] ANN: HDF5 for Python 1.0
Just FYI, the Windows installer for 1.0 is now posted at h5py.googlecode.com after undergoing some final testing. Thanks for trying 0.3.0... too bad about matlab.

Andrew

On Mon, 2008-12-01 at 21:53 -0500, [EMAIL PROTECTED] wrote:
>> Requires
>> * UNIX-like platform (Linux or Mac OS-X); Windows version is in
>> progress
>
> I installed version 0.3.0 back in August on WindowsXP, and as far as I
> remember there were no problems at all with the install, and all tests
> pass. I thought the interface was really easy to use. But after trying
> it out I realized that my matlab is too old to understand the
> generated hdf5 files in an easy-to-use way, and I had to go back to
> csv-files.
>
> Josef
Re: [Numpy-discussion] np.loadtxt : yet a new implementation...
On Dec 2, 2008, at 3:12 PM, Ryan May wrote:
> Pierre GM wrote:
>> Well, looks like the attachment is too big, so here's the
>> implementation. The tests will come in another message.
>
> A couple of quick nitpicks:
>
> 1) On line 186 (in the NameValidator class), you use
> excludelist.append() to append a list to the end of a list. I think
> you meant to use excludelist.extend()

Good call.

> 2) When validating a list of names, why do you insist on lower casing
> them? (I'm referring to the call to lower() on line 207). On one hand,
> this would seem nicer than all upper case, but on the other hand this
> can cause confusion for someone who sees certain casing of names in
> the file and expects that data to be laid out the same.

I recall a life where names were case-insensitive, so 'dates' and 'Dates' and 'DATES' were the same field. It should be easy enough to get rid of that limitation, or add a parameter for case-sensitivity.

On Dec 2, 2008, at 2:47 PM, Zachary Pincus wrote:
> Specifically, on line 115 in LineSplitter, we have:
>
>     self.delimiter = delimiter.strip() or None
>
> so if I pass in, say, '\t' as the delimiter, self.delimiter gets set
> to None, which then causes the default behavior of
> any-whitespace-is-delimiter to be used. This makes lines like
> "Gene Name\tPubMed ID\tStarting Position" get split wrong, even when I
> explicitly pass in '\t' as the delimiter!

OK, I'll check that.

> I think that treating an explicitly-passed-in ' ' delimiter as
> identical to 'no delimiter' is a bad idea. If I say that ' ' is the
> delimiter, or '\t' is the delimiter, this should be treated *just*
> like ',' being the delimiter, where the expected output is:
>
>     ['1', '2', '3', '4', '', '5']

Valid point. Well, all, stay tuned for yet another "yet another implementation"...

> Other than those, it's working fine for me here.
>
> Ryan
Re: [Numpy-discussion] np.loadtxt : yet a new implementation...
Pierre GM wrote:
>> I think that treating an explicitly-passed-in ' ' delimiter as
>> identical to 'no delimiter' is a bad idea. If I say that ' ' is the
>> delimiter, or '\t' is the delimiter, this should be treated *just*
>> like ',' being the delimiter, where the expected output is:
>>
>>     ['1', '2', '3', '4', '', '5']
>
> Valid point. Well, all, stay tuned for yet another "yet another
> implementation"...

While we're at it, it might be nice to be able to pass in more than one delimiter: ('\t',' '). Though maybe the only combination that I'd really want would be something and '\n', which I think is being treated specially already.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/ORR            (206) 526-6959   voice
7600 Sand Point Way NE  (206) 526-6329   fax
Seattle, WA 98115       (206) 526-6317   main reception
[EMAIL PROTECTED]
Re: [Numpy-discussion] np.loadtxt : yet a new implementation...
Chris,

I can try, but in that case, please write me a unittest, so that I have a clear and unambiguous idea of what you expect.

ANFSCD, have you tried the missing_values option ?

On Dec 2, 2008, at 5:36 PM, Christopher Barker wrote:
> Pierre GM wrote:
>>> I think that treating an explicitly-passed-in ' ' delimiter as
>>> identical to 'no delimiter' is a bad idea. If I say that ' ' is the
>>> delimiter, or '\t' is the delimiter, this should be treated *just*
>>> like ',' being the delimiter, where the expected output is:
>>>
>>>     ['1', '2', '3', '4', '', '5']
>>
>> Valid point. Well, all, stay tuned for yet another "yet another
>> implementation"...
>
> While we're at it, it might be nice to be able to pass in more than
> one delimiter: ('\t',' '). though maybe that only combination that I'd
> really want would be something and '\n', which I think is being
> treated specially already.
>
> -Chris
Re: [Numpy-discussion] ANN: HDF5 for Python 1.0
If it's a feature people want, I certainly wouldn't mind looking into it. I believe PyTables supports bzip2 as well. Adding filters to HDF5 takes a bit of work but is well supported by the library.

Andrew

On Tue, 2008-12-02 at 22:53 +0100, Stephen Simmons wrote:
> Do you have any plans to add lzo compression support, in addition to
> gzip? This is a feature I used a lot in PyTables.
>
> Andrew Collette wrote:
>> Announcing HDF5 for Python (h5py) 1.0
>>
>> What is h5py?
>>
>> HDF5 for Python (h5py) is a general-purpose Python interface to the
>> Hierarchical Data Format library, version 5. HDF5 is a versatile,
>> mature scientific software library designed for the fast, flexible
>> storage of enormous amounts of data.
[Numpy-discussion] PyArray_EMPTY and Cython
After some discussion on the Cython lists I thought I would try my hand at writing some Cython accelerators for empty and zeros. This will involve using PyArray_EMPTY. I have a simple prototype I would like to get working, but currently it segfaults. Any tips on what I might be missing?

    import numpy as np
    cimport numpy as np

    cdef extern from "numpy/arrayobject.h":
        PyArray_EMPTY(int ndims, np.npy_intp* dims, int type, bint fortran)

    cdef np.ndarray empty(np.npy_intp length):
        cdef np.ndarray[np.double_t, ndim=1] ret
        cdef int type = np.NPY_DOUBLE
        cdef int ndims = 1
        cdef np.npy_intp* dims
        dims = &length
        print dims[0]
        print type
        ret = PyArray_EMPTY(ndims, dims, type, False)
        return ret

    def test():
        cdef np.ndarray[np.double_t, ndim=1] y = empty(10)
        return y

The code seems to print out the correct dims and type info but segfaults when the PyArray_EMPTY call is made.

Thanks,

Gabriel