On Mo, 2014-07-28 at 16:31 +0200, Eelco Hoogendoorn wrote:
> Sebastian:
>
> Those are good points. Indeed, iteration order may already produce
> different results, even though the semantics of numpy suggest
> identical operations. Still, I feel this different behavior without
> any semantic clues is something to be minimized.
>
> Indeed, copying might have large speed implications. But on second
> thought, does it? Either the data is already aligned and no copy is
> required, or it isn't aligned, and we need one pass of
> cache-inefficient access to the data anyway. In fact, if we had one
> low-level function which does cache-intelligent transposition of
> numpy arrays (using some block strategy), this might be faster even
> than the current simple reduction operations when forced to work on
> awkwardly aligned data. Ideally, this intelligent access and
> intelligent reduction would be part of a single pass, of course; but
> that wouldn't really fit within the numpy design, and such an
> intelligent transpose alone would provide most of the benefit, I
> think. Or is the mechanism behind ascontiguousarray already
> intelligent in this sense?
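[For concreteness, a cache-blocked transpose of the kind described
above might look roughly like the sketch below. The function name, the
tile size of 64, and the pure-Python loops are assumptions for the
example; numpy ships no such helper, and a real implementation would
live in C and derive the tile size from the cache geometry.]

import numpy as np

def blocked_transpose(a, block=64):
    # Copy tile by tile so that both the reads from `a` and the writes
    # to `out` stay within a cache-sized window, instead of striding
    # through one of the two arrays with a large step on every element.
    n, m = a.shape
    out = np.empty((m, n), dtype=a.dtype)
    for i in range(0, n, block):
        for j in range(0, m, block):
            out[j:j+block, i:i+block] = a[i:i+block, j:j+block].T
    return out

a = np.arange(12.0).reshape(3, 4)
assert (blocked_transpose(a) == a.T).all()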
The iterator is currently smart in the sense that, at a low level, it
will do something like [1]. Most things in numpy use that iterator;
ascontiguousarray does so as well. Such a blocked, cache-aware iterator
is what I mean when I say that someone who knows it would have to
spend a lot of time on it :).

[1] Appendix:

arr = np.ones((100, 100))

arr.sum(1)
# is equivalent (iteration order wise) to:
res = np.zeros(100)
for i in range(100):
    res[i] = arr[i, :].sum()

# while arr.sum(0) would be:
res = np.zeros(100)
for i in range(100):
    res += arr[i, :]

> On Mon, Jul 28, 2014 at 4:06 PM, Sebastian Berg
> <sebast...@sipsolutions.net> wrote:
> > On Mo, 2014-07-28 at 15:35 +0200, Sturla Molden wrote:
> > > On 28/07/14 15:21, alex wrote:
> > > > Are you sure they always give different results? Notice that
> > > >     np.ones((N, 2)).mean(0)
> > > >     np.ones((2, N)).mean(1)
> > > > compute means of different axes on transposed arrays, so these
> > > > differences 'cancel out'.
> > >
> > > They will differ if different algorithms are used.
> > > np.ones((N, 2)).mean(0) will have a larger accumulated rounding
> > > error than np.ones((2, N)).mean(1) if only the latter uses the
> > > divide-and-conquer summation.
> >
> > What I wanted to point out is that to some extent the algorithm
> > does not matter. You will not necessarily get identical results
> > already if you use a different iteration order, and we have been
> > doing that for years for speed reasons. All libs like BLAS do the
> > same. Yes, the new changes make this much more dramatic, but they
> > only make some paths much better, never worse. It might be
> > dangerous, but only in the sense that you test it with the good
> > path and it works well enough, but later (also) use the other one
> > in some lib. I am not even sure if I ...
> >
> > > I would suggest that in the first case we try to copy the array
> > > to a temporary contiguous buffer and use the same
> > > divide-and-conquer algorithm, unless some heuristic on memory
> > > usage fails.
> >
> > Sure, but you have to make major changes to the buffered iterator
> > to do that without larger speed implications. It might be a good
> > idea, but it requires someone who knows this stuff to spend a lot
> > of time and care in the depths of numpy.
> >
> > > Sturla
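[For concreteness, the divide-and-conquer ("pairwise") summation
discussed above can be sketched in pure Python as below. The function
name pairwise_sum and the blocksize of 128 are assumptions for the
example; numpy's actual implementation is in C and differs in detail,
so this illustrates only the error behavior, not the performance.]

import numpy as np

def pairwise_sum(a, blocksize=128):
    # Small blocks: plain left-to-right accumulation in the array's
    # own precision.
    if a.size <= blocksize:
        total = a.dtype.type(0)
        for v in a:
            total += v
        return total
    # Larger inputs: split in half and add the partial sums; the
    # worst-case rounding error then grows like O(log n) rather than
    # the O(n) of naive accumulation.
    mid = a.size // 2
    return pairwise_sum(a[:mid], blocksize) + pairwise_sum(a[mid:], blocksize)

x = np.full(10**6, 0.1, dtype=np.float32)
print(x.sum())          # pairwise path: close to the true 100000
print(pairwise_sum(x))  # comparable accuracy (but slow -- pure Python)

naive = np.float32(0)
for v in x:             # naive float32 accumulation drifts noticeably
    naive += v
print(naive)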