On Tue, 2015-11-10 at 10:24 -0500, Benjamin Root wrote:
> Just pointing out np.loadtxt(..., ndmin=2) will always return a 2D
> array. Notice that without that option, the result is effectively
> squeezed. So if you don't specify that option, and you load up a CSV
> file with only one row, you will get a very differently shaped array
> than if you load up a CSV file with two rows.
>
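For concreteness, here is a minimal sketch of the squeeze Ben describes. The in-memory "files" and their values are made up for illustration; only np.loadtxt and its ndmin argument come from the thread.

```python
import io
import numpy as np

# Hypothetical data standing in for CSV files with one and two rows.
one_row = io.StringIO("1,2,3\n")
two_rows = io.StringIO("1,2,3\n4,5,6\n")

# Default behaviour: the single-row result is squeezed to 1-D.
squeezed = np.loadtxt(one_row, delimiter=",")
stacked = np.loadtxt(two_rows, delimiter=",")

# Rewind the buffer and read the same single row again, asking for 2-D.
one_row.seek(0)
kept_2d = np.loadtxt(one_row, delimiter=",", ndmin=2)

print(squeezed.shape)  # one row, default
print(stacked.shape)   # two rows, default
print(kept_2d.shape)   # one row, ndmin=2
```

With the default, the one-row file comes back 1-D while the two-row file comes back 2-D; with ndmin=2 both are 2-D.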
Oh, well I personally think that default squeeze is an abomination :).

Anyway, I just wanted to point out that there are two different
possible logics, and we have to pick one. I have a slight preference
for the indexing/array-like interpretation, but I am aware that from a
usage point of view the sequence one is likely better.

I could throw in another option: throw an explicit error instead of
the general one.

Anyway, I *really* do not have an opinion about what is better.

Array-like would only suggest that you also accept buffer interface
objects or array_interface stuff, which in this case is really
unnecessary, I think.

- Sebastian

> Ben Root
>
> On Tue, Nov 10, 2015 at 10:07 AM, Irvin Probst
> <irvin.pro...@ensta-bretagne.fr> wrote:
>     On 10/11/2015 14:17, Sebastian Berg wrote:
>         Actually, it is the "sequence special case" type ;).
>         (matlab does not have this, since matlab always returns
>         2-D I realized).
>
>         As I said, if usecols is like indexing, the result should
>         mimic:
>
>             arr = np.loadtxt(f)
>             arr = arr[usecols]
>
>         in which case a 1-D array is returned if you put a scalar
>         into usecols (and you could even generalize usecols to
>         higher dimensional array-likes).
>         The way you implemented it -- which is fine, but I want to
>         stress that there is a real decision being made here --, you
>         always see it as a sequence but allow a scalar for
>         convenience (i.e. always return a 2-D array). It is a
>         `sequence of ints or int` type argument and not an
>         array-like argument in my opinion.
>
>     I think we have two separate problems here:
>
>     The first one is whether loadtxt should always return a 2D
>     array or whether it should match the shape of the usecols
>     argument. From a CS guy's point of view I do understand your
>     concern here. Now from a teacher's point of view I know many
>     people expect to get a "matrix" (thank you Matlab...) and the
>     "purity" of matching the dimension of the usecols variable will
>     be seen by many people [1] as a nerdy, useless heaviness no one
>     cares about (no offense). So whatever you, seasoned numpy devs
>     from this mailing list, decide, I think it should be explained
>     in the docstring with very clear wording.
>
>     My own opinion on this first problem is that loadtxt() should
>     always return a 2D array, no less, no more. If I write
>     np.loadtxt(f)[42] it means I want to read the whole file and
>     then I explicitly ask for transforming the 2-D array loadtxt()
>     returned into a 1-D array. OTOH if I write
>     loadtxt(f, usecols=42) it means I don't want to read the other
>     columns and I want only this one, but it does not mean that I
>     want to change the returned array from 2-D to 1-D. I know this
>     new behavior might break a lot of existing code as
>     usecols=(42,) used to return a 1-D array, but
>     usecols=((((42,)))) also returns a 1-D array so the current
>     behavior is not consistent imho.
>
>     The second problem is about the wording in the docstring. When
>     I see "sequence of int or int" I understand I will have to cast
>     into a 1-D python list whatever wicked N-dimensional object I
>     use to store my column indexes, or hope list(my_object) will
>     do it fine. On the other hand when I read "array-like" the
>     function is telling me I don't have to worry about my object:
>     as long as numpy knows how to cast it into an array it will be
>     fine.
>
>     Anyway I think something like this:
>
>         import numpy as np
>         a=[[[2,],[],[],],[],[],[]]
>         foo=np.loadtxt("CONCARNEAU_2010.txt", usecols=a)
>
>     should just work and return me a 2-D (or 1-D if you like)
>     array with the data I asked for, and I don't think "a" here is
>     an int or a sequence of int (but it's a good example of why
>     loadtxt() should not match the shape of the usecols argument).
>
>     To make it short, let the reading function read the data in a
>     consistent and predictable way and then let the user
>     explicitly change the data's shape into anything he likes.
>
>     Regards.
>
>     [1] read: non-CS people trying to switch to numpy/scipy
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
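To make the two readings of usecols concrete, here is a minimal sketch. It deliberately uses plain NumPy column indexing on an already loaded array (arr[:, cols], which is my assumption of what "like indexing" means for columns) instead of relying on what usecols does with a scalar, since that is exactly what is being decided. The data is made up.

```python
import io
import numpy as np

# Made-up file contents: two rows, three columns.
text = "1 2 3\n4 5 6\n"

full = np.loadtxt(io.StringIO(text))

# "Indexing" reading: a scalar column index drops a dimension,
# a sequence of column indices keeps it.
by_scalar = full[:, 1]
by_sequence = full[:, [1]]

# "Always 2-D" reading: pick one column but keep both axes,
# here by combining a one-element sequence with ndmin=2.
always_2d = np.loadtxt(io.StringIO(text), usecols=(1,), ndmin=2)

print(by_scalar.shape)
print(by_sequence.shape)
print(always_2d.shape)
```

Under the indexing reading the scalar case is 1-D and the sequence case is a 2-D column; under the always-2-D reading the caller gets a column either way and reshapes explicitly afterwards if a 1-D array is wanted.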