Just pointing out that np.loadtxt(..., ndmin=2) will always return a 2-D array. Notice that without that option, the result is effectively squeezed. So if you don't specify it and you load a CSV file with only one row, you get a 1-D array, whereas a CSV file with two rows gives you a 2-D array.
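
A minimal sketch of the difference, assuming comma-separated input held in memory (the `load` helper and the sample strings are made up for illustration; `io.StringIO` stands in for a real CSV file):

```python
import io

import numpy as np

# Hypothetical sample data: one CSV "file" with a single row, one with two rows.
one_row = "1,2,3\n"
two_rows = "1,2,3\n4,5,6\n"


def load(text, **kwargs):
    """Illustrative helper: parse CSV text with np.loadtxt via a file-like object."""
    return np.loadtxt(io.StringIO(text), delimiter=",", **kwargs)


# Default behaviour: the single-row file comes back squeezed to 1-D.
shape_one_default = load(one_row).shape
shape_two_default = load(two_rows).shape

# With ndmin=2 both results are 2-D, whatever the number of rows.
shape_one_2d = load(one_row, ndmin=2).shape
shape_two_2d = load(two_rows, ndmin=2).shape

print(shape_one_default, shape_two_default)  # (3,) (2, 3)
print(shape_one_2d, shape_two_2d)            # (1, 3) (2, 3)
```

So code that does `data[:, 0]` on the result works for both files only when ndmin=2 is passed.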
Ben Root

On Tue, Nov 10, 2015 at 10:07 AM, Irvin Probst <irvin.pro...@ensta-bretagne.fr> wrote:

> On 10/11/2015 14:17, Sebastian Berg wrote:
>
>> Actually, it is the "sequence special case" type ;). (matlab does not
>> have this, since matlab always returns 2-D I realized).
>>
>> As I said, if usecols is like indexing, the result should mimic:
>>
>> arr = np.loadtxt(f)
>> arr = arr[usecols]
>>
>> in which case a 1-D array is returned if you put in a scalar into
>> usecols (and you could even generalize usecols to higher dimensional
>> array-likes).
>> The way you implemented it -- which is fine, but I want to stress that
>> there is a real decision being made here --, you always see it as a
>> sequence but allow a scalar for convenience (i.e. always return a 2-D
>> array). It is a `sequence of ints or int` type argument and not an
>> array-like argument in my opinion.
>>
>
> I think we have two separate problems here:
>
> The first one is whether loadtxt should always return a 2-D array or
> whether it should match the shape of the usecols argument. From a CS guy's
> point of view I do understand your concern here. Now from a teacher's point
> of view I know many people expect to get a "matrix" (thank you Matlab...)
> and the "purity" of matching the dimension of the usecols variable will be
> seen by many people [1] as nerdy, useless heaviness no one cares about (no
> offense). So whatever you, seasoned numpy devs from this mailing list,
> decide, I think it should be explained in the docstring with very clear
> wording.
>
> My own opinion on this first problem is that loadtxt() should always
> return a 2-D array, no less, no more. If I write np.loadtxt(f)[42] it means
> I want to read the whole file and then I explicitly ask for transforming
> the 2-D array loadtxt() returned into a 1-D array. OTOH if I write
> loadtxt(f, usecols=42) it means I don't want to read the other columns and
> I want only this one, but it does not mean that I want to change the
> returned array from 2-D to 1-D. I know this new behavior might break a lot
> of existing code as usecols=(42,) used to return a 1-D array, but
> usecols=((((42,)))) also returns a 1-D array so the current behavior is not
> consistent imho.
>
> The second problem is about the wording in the docstring. When I see
> "sequence of int or int" I understand I will have to cast into a 1-D python
> list whatever wicked N-dimensional object I use to store my column indexes,
> or hope list(my_object) will do it fine. On the other hand, when I read
> "array-like" the function is telling me I don't have to worry about my
> object: as long as numpy knows how to cast it into an array it will be fine.
>
> Anyway I think something like this:
>
> import numpy as np
> a=[[[2,],[],[],],[],[],[]]
> foo=np.loadtxt("CONCARNEAU_2010.txt", usecols=a)
>
> should just work and return me a 2-D (or 1-D if you like) array with the
> data I asked for, and I don't think "a" here is an int or a sequence of int
> (but it's a good example of why loadtxt() should not match the shape of the
> usecols argument).
>
> To make it short, let the reading function read the data in a consistent
> and predictable way and then let the user explicitly change the data's
> shape into anything he likes.
>
> Regards.
>
> [1] read: non-CS people trying to switch to numpy/scipy
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
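
For reference, the "usecols is like indexing" reading in the quoted message can be sketched with plain NumPy indexing. This is only an illustration under two assumptions: the array below is a made-up stand-in for the result of np.loadtxt(f), and column selection is spelled `arr[:, usecols]` here, which may not be exactly what the quoted `arr[usecols]` shorthand intended.

```python
import numpy as np

# Made-up stand-in for the result of np.loadtxt(f): 3 rows, 4 columns.
arr = np.arange(12.0).reshape(3, 4)

# Scalar column index: indexing semantics drop the column axis (1-D result).
col_scalar = arr[:, 2]

# One-element sequence: the column axis is kept (2-D result with one column).
col_sequence = arr[:, [2]]

print(col_scalar.shape)    # (3,)
print(col_sequence.shape)  # (3, 1)
```

Under the "always 2-D" reading argued for above, both spellings would instead give the (3, 1) shape.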