On Wed, Dec 28, 2011 at 1:57 PM, Dag Sverre Seljebotn <d.s.seljeb...@astro.uio.no> wrote:

> On 12/28/2011 01:52 PM, Dag Sverre Seljebotn wrote:
> > On 12/28/2011 09:33 AM, Ralf Gommers wrote:
> >>
> >>
> >> 2011/12/27 Jordi Gutiérrez Hermoso <jord...@octave.org>
> >>
> >>     On 26 December 2011 14:56, Ralf Gommers <ralf.gomm...@googlemail.com> wrote:
> >>     >
> >>     >
> >>     > On Mon, Dec 26, 2011 at 8:50 PM, <josef.p...@gmail.com> wrote:
> >>     >> I have a hard time thinking through empty 2-dim arrays, and don't know
> >>     >> what rules should apply.
> >>     >> However, in my code I might want to catch these cases rather early
> >>     >> than late and then having to work my way backwards to find out where
> >>     >> the content disappeared.
> >>     >
> >>     >
> >>     > Same here. Almost always, my empty arrays are either due to bugs or they
> >>     > signal that I do need to special-case something. Silent passing through of
> >>     > empty arrays to all numpy functions is not what I would want.
> >>
> >>     I find it quite annoying to treat the empty set with special
> >>     deference. "All of my great-grandkids live in Antarctica" should be
> >>     true for me (I'm only 30 years old). If you decide that is not true
> >>     for me, it leads to a bunch of other logical annoyances up there
> >>
> >>
> >> Guess you don't mean true/false, because it's neither. But I understand
> >> you want an empty array back instead of an error.
> >>
> >> Currently the problem is that when you do get that empty array back,
> >> you'll then use that for something else and it will probably still
> >> crash. Many numpy functions do not check for empty input and will still
> >> give exceptions. My impression is that you're better off handling these
> >> where you create the empty array, rather than in some random place later
> >> on. The alternative is to have consistent rules for empty arrays, and
> >> handle them explicitly in all functions.
> >> Can be done, but is of course a lot of work and has some overhead.
> >
> > Are you saying that the existence of other bugs means that this bug
> > shouldn't be fixed? I just fail to see the relevance of these other bugs
> > to this discussion.

See below.

> > For the record, I've encountered this bug many times myself and it's
> > rather irritating, since it leads to more verbose code.
> >
> > It is useful whenever you want to return data that is a subset of the
> > input data (since the selected subset can sometimes be zero-sized --
> > remember, in computer science the only numbers are 0, 1, and "any
> > number").
> >
> > Here's one of the examples I've had. The Interpolative Decomposition
> > decomposes an m-by-n matrix A of rank k as
> >
> >     A = B C
> >
> > where B is an m-by-k matrix consisting of a subset of the columns of A,
> > and C is a k-by-n matrix.
> >
> > Now, if A is all zeros (which is often the case for me), then k is 0. I
> > would still like to create the m-by-0 matrix B by doing
> >
> >     B = A[:, selected_columns]
> >
> > But now I have to do this instead:
> >
> >     if len(selected_columns) == 0:
> >         B = np.zeros((A.shape[0], 0), dtype=A.dtype)
> >     else:
> >         B = A[:, selected_columns]
> >
> > In this case, zero-sized B and C are of course perfectly valid and
> > useful results:
> >
> >     In [2]: np.dot(np.ones((3,0)), np.ones((0, 5)))
> >     Out[2]:
> >     array([[ 0., 0., 0., 0., 0.],
> >            [ 0., 0., 0., 0., 0.],
> >            [ 0., 0., 0., 0., 0.]])
> >
>
> And to answer the obvious question: Yes, this is a real use case. It is
> used for something similar to image compression, where sub-sections of
> the images may well be all-zero and have zero rank (full story at [1]).
>

Thanks for the example. I was a little surprised that dot works. Then I
read what Wikipedia had to say about empty arrays. It mentions dot like
you do, and that the determinant of the 0-by-0 matrix is 1.
So I try:

In [1]: a = np.zeros((0,0))

In [2]: a
Out[2]: array([], shape=(0, 0), dtype=float64)

In [3]: np.linalg.det(a)
Parameter 4 to routine DGETRF was incorrect
<segfault>

> Reading the above thread I understand Ralf's reasoning better, but
> really, relying on NumPy's buggy behaviour to discover bugs in user code
> seems like the wrong approach. Tools should be dumb unless there are
> good reasons to make them smart. I'd be rather irritated about my hammer
> if it refused to drive in nails that it decided were in the wrong spot.
>

The point is not that we shouldn't fix it, but that it's a waste of time
to fix it in only one place. I remember fixing several functions to
explicitly check for empty arrays and then returning an empty array or
giving a sensible error.

So can you answer my question: do you think it's worth the time and
computational overhead to handle empty arrays in all functions?

Ralf
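
To make the trade-off in the question above concrete, here is a minimal
Python sketch of explicit empty-array handling for the two cases raised in
this thread: selecting zero columns, and taking the determinant of a 0-by-0
matrix. The helper names `select_columns` and `det_allow_empty` are
hypothetical and are not part of NumPy, and the empty-selection indexing
assumes a reasonably recent NumPy in which an empty integer index array is
accepted.

```python
import numpy as np


def select_columns(A, selected_columns):
    """Return A[:, selected_columns], allowing an empty selection.

    Hypothetical helper for illustration; not a NumPy function.
    """
    # Force an integer index array so that an empty list does not turn
    # into a float64 array, which would not be a valid index.
    idx = np.asarray(selected_columns, dtype=np.intp)
    return A[:, idx]


def det_allow_empty(a):
    """Determinant with an explicit check for the 0-by-0 case.

    Hypothetical helper for illustration; not a NumPy function.
    """
    a = np.asarray(a)
    if a.ndim != 2 or a.shape[0] != a.shape[1]:
        raise ValueError("expected a square 2-D array, got shape %r" % (a.shape,))
    if a.shape[0] == 0:
        # Empty-product convention: the determinant of the 0-by-0 matrix is 1.
        return 1.0
    return np.linalg.det(a)


A = np.zeros((3, 4))
B = select_columns(A, [])          # m-by-0 matrix, same dtype as A
C = np.ones((0, 5))                # 0-by-n matrix
product = np.dot(B, C)             # 3-by-5 matrix of zeros
empty_det = det_allow_empty(np.zeros((0, 0)))
```

Under those assumptions, the added cost per call in `det_allow_empty` is an
`np.asarray` conversion and a few shape comparisons; whether that is
acceptable across all functions is exactly the question being asked.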
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion