On Thu, Jun 24, 2010 at 1:00 PM, Warren Weckesser < warren.weckes...@enthought.com> wrote:
> Benjamin Root wrote:
> > Hi,
> >
> > I was having the hardest time trying to figure out an intermittent bug
> > in one of my programs.  Essentially, in some situations, it was
> > throwing an error saying that the array object was not an array.  It
> > took me a while, but then I figured out that my program was assuming
> > that the object returned from a loadtxt() call was always a structured
> > array (I was using dtypes).  However, if the data file being loaded
> > only had one data record, then all you get back is a structured record.
> >
> > import numpy as np
> > from StringIO import StringIO
> >
> > strData = StringIO("89.23 47.2\n13.2 42.2")
> > a = np.loadtxt(strData, dtype=[('x', float), ('y', float)])
> > print "Length Two"
> > print a
> > print a.shape
> > print len(a)
> >
> > strData = StringIO("53.2 49.2")
> > a = np.loadtxt(strData, dtype=[('x', float), ('y', float)])
> > print "\n\nLength One"
> > print a
> > print a.shape
> > try:
> >     print len(a)
> > except TypeError as err:
> >     print "ERROR:", err
> >
> > Which gets me this output:
> >
> > Length Two
> > [(89.230000000000004, 47.200000000000003)
> >  (13.199999999999999, 42.200000000000003)]
> > (2,)
> > 2
> >
> >
> > Length One
> > (53.200000000000003, 49.200000000000003)
> > ()
> > ERROR: len() of unsized object
> >
> >
> > Note that this isn't restricted to structured arrays.  For regular
> > ndarrays, loadtxt() appears to mimic the behavior of np.squeeze():
>
> Exactly.  The last four lines of the function are:
>
>     X = np.squeeze(X)
>     if unpack:
>         return X.T
>     else:
>         return X
>
> >
> > >>> a = np.ones((1, 1, 1))
> > >>> np.squeeze(a)[0]
> > IndexError: 0-d arrays can't be indexed
> >
> > >>> strData = StringIO("53.2")
> > >>> a = np.loadtxt(strData)
> > >>> a[0]
> > IndexError: 0-d arrays can't be indexed
> >
> > So, if you have multiple lines with multiple columns, you get a 2-D
> > array, as expected.
> > If you have a single line of data with multiple columns, you get a 1-D
> > array.
> > If you have a single column with many lines, you also get a 1-D array
> > (which is probably expected, I guess).
> > If you have a single column with a single line, you get a scalar
> > (actually, a 0-D array).
> >
> > Is this a bug or a feature?  I can see the advantages of having
> > loadtxt() return the lowest number of dimensions that can hold the
> > given data, but it leaves the code vulnerable to certain edge cases.
> > Maybe there is a different way I should be doing this, but I feel that
> > this behavior should at the very least be included in the loadtxt
> > documentation.
> >
>
> It would be useful to be able to tell loadtxt to not call squeeze, so a
> program that reads column-formatted data doesn't have to treat the case
> of a single line specially.
>
> Warren
>

I don't know if that is the best way to solve the problem.  In that case,
you would always get a 2-D array, right?  Is that useful for those who have
text data as a single column?  Maybe add a mindim keyword (with None as the
default) that applies the appropriate "atleast_Nd()" call (or maybe make a
general .atleast_nd() function available?).  But then what would this mean
for structured arrays?  One might think that they want at least 2-D, but
they really want at least 1-D.

Ben Root

P.S. - Taking this a step further, the functions completely fail in dealing
with empty files...  In MATLAB, it returns an empty array (matrix?).
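
To make the mindim idea above a bit more concrete, here is a rough sketch
of what I have in mind, written as a wrapper instead of a change to loadtxt
itself.  The name loadtxt_mindim and the mindim keyword are made up for
illustration and are not part of NumPy; only np.loadtxt, np.atleast_1d and
np.atleast_2d are existing functions.  I wrote it against io.StringIO so it
runs on a current Python.  Note that it works on the already-squeezed
result, so it cannot tell a single row from a single column.

```python
from io import StringIO

import numpy as np


def loadtxt_mindim(fname, mindim=None, **kwargs):
    """Hypothetical wrapper: np.loadtxt plus an optional minimum dimension.

    mindim=None keeps whatever loadtxt returns; 1 or 2 pads the result up
    to at least that many dimensions.
    """
    data = np.loadtxt(fname, **kwargs)
    if mindim is None:
        return data
    if mindim == 1:
        # What structured-array users most likely want: a single record
        # becomes a length-1 array instead of a 0-d one.
        return np.atleast_1d(data)
    if mindim == 2:
        # Caveat: the squeezed result has lost its orientation, so a single
        # column of N lines comes back as (1, N), not (N, 1).
        return np.atleast_2d(data)
    raise ValueError("mindim must be None, 1, or 2")


dt = [('x', float), ('y', float)]

# One structured record: len() now works.
one_record = loadtxt_mindim(StringIO("53.2 49.2"), mindim=1, dtype=dt)
print(one_record.shape, len(one_record))

# One value in a plain file, forced up to 2-D.
one_value = loadtxt_mindim(StringIO("53.2"), mindim=2)
print(one_value.shape)
```

The single-column caveat is the main argument for doing this inside loadtxt,
before the squeeze, where the row/column layout is still known.
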
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion