On Thu, Jun 24, 2010 at 1:53 PM, Benjamin Root <ben.r...@ou.edu> wrote:
> On Thu, Jun 24, 2010 at 1:00 PM, Warren Weckesser <warren.weckes...@enthought.com> wrote:
>
>> Benjamin Root wrote:
>> > Hi,
>> >
>> > I was having the hardest time trying to figure out an intermittent bug
>> > in one of my programs.  Essentially, in some situations, it was
>> > throwing an error saying that the array object was not an array.  It
>> > took me a while, but then I figured out that my program was assuming
>> > that the object returned from a loadtxt() call was always a structured
>> > array (I was using dtypes).  However, if the data file being loaded
>> > only had one data record, then all you get back is a structured record.
>> >
>> > import numpy as np
>> > from StringIO import StringIO
>> >
>> > strData = StringIO("89.23 47.2\n13.2 42.2")
>> > a = np.loadtxt(strData, dtype=[('x', float), ('y', float)])
>> > print "Length Two"
>> > print a
>> > print a.shape
>> > print len(a)
>> >
>> > strData = StringIO("53.2 49.2")
>> > a = np.loadtxt(strData, dtype=[('x', float), ('y', float)])
>> > print "\n\nLength One"
>> > print a
>> > print a.shape
>> > try:
>> >     print len(a)
>> > except TypeError as err:
>> >     print "ERROR:", err
>> >
>> > Which gets me this output:
>> >
>> > Length Two
>> > [(89.230000000000004, 47.200000000000003)
>> >  (13.199999999999999, 42.200000000000003)]
>> > (2,)
>> > 2
>> >
>> >
>> > Length One
>> > (53.200000000000003, 49.200000000000003)
>> > ()
>> > ERROR: len() of unsized object
>> >
>> >
>> > Note that this isn't restricted to structured arrays.  For regular
>> > ndarrays, loadtxt() appears to mimic the behavior of np.squeeze():
>>
>> Exactly.  The last four lines of the function are:
>>
>>     X = np.squeeze(X)
>>     if unpack:
>>         return X.T
>>     else:
>>         return X
>>
>> >
>> > >>> a = np.ones((1, 1, 1))
>> > >>> np.squeeze(a)[0]
>> > IndexError: 0-d arrays can't be indexed
>> >
>> > >>> strData = StringIO("53.2")
>> > >>> a = np.loadtxt(strData)
>> > >>> a[0]
>> > IndexError: 0-d arrays can't be indexed
>> >
>> > So, if you have multiple lines with multiple columns, you get a 2-D
>> > array, as expected.
>> > If you have a single line of data with multiple columns, you get a 1-D
>> > array.
>> > If you have a single column with many lines, you also get a 1-D array
>> > (which is probably expected, I guess).
>> > If you have a single column with a single line, you get a scalar
>> > (actually, a 0-D array).
>> >
>> > Is this a bug or a feature?  I can see the advantages of having
>> > loadtxt() return the lowest number of dimensions that can hold the
>> > given data, but it leaves the code vulnerable to certain edge cases.
>> > Maybe there is a different way I should be doing this, but I feel that
>> > this behavior at the very least should be included in the loadtxt
>> > documentation.
>> >
>>
>> It would be useful to be able to tell loadtxt to not call squeeze, so a
>> program that reads column-formatted data doesn't have to treat the case
>> of a single line specially.
>>
>> Warren
>>
>
> I don't know if that is the best way to solve the problem.  In that case,
> you would always get a 2-D array, right?  Is that useful for those who have
> text data as a single column?  Maybe a mindim keyword (with None as default)
> that applies an appropriate "atleast_Nd()" call (or maybe have available an
> .atleast_nd() function?).  But then what would this mean for structured
> arrays?  One might think that they want at least 2-D, but they really want
> at least 1-D.
>
> Ben Root
>
> P.S. - Taking this a step further, the functions completely fail in dealing
> with empty files...  In MATLAB, it returns an empty array (matrix?).
>

I am reviving this "dead" thread to note that I have filed ticket #1562 on
the numpy Trac about this issue:

http://projects.scipy.org/numpy/ticket/1562

Ben Root
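
Until loadtxt() itself changes, one possible workaround for the single-record
case quoted above is to pass the result through np.atleast_1d().  The snippet
below is only a minimal sketch, not a proposed patch: load_records and
POINT_DTYPE are hypothetical names made up for illustration, and it is written
for Python 3 (io.StringIO, print()) rather than the Python 2 used in the
original example.

```python
from io import StringIO  # Python 3; the original example used the Python 2 StringIO module

import numpy as np

# Same two-field dtype as in the example quoted above.
POINT_DTYPE = [('x', float), ('y', float)]


def load_records(source, dtype=POINT_DTYPE):
    """Hypothetical helper: load text data and return at least a 1-D array."""
    data = np.loadtxt(source, dtype=dtype)
    # np.atleast_1d reshapes a 0-d result (a single record) to shape (1,)
    # and leaves arrays that already have one or more dimensions unchanged.
    return np.atleast_1d(data)


one = load_records(StringIO("53.2 49.2"))
two = load_records(StringIO("89.23 47.2\n13.2 42.2"))

print(one.shape, len(one))  # (1,) 1
print(two.shape, len(two))  # (2,) 2
```

With this wrapper, len() and indexing behave the same for one record as for
many.  It does nothing for the empty-file case mentioned in the P.S.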
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion