On Tue, Jul 20, 2010 at 6:35 PM, Keith Goodman <kwgood...@gmail.com> wrote:
> On Mon, Jul 19, 2010 at 10:08 PM, Charles R Harris
> <charlesr.har...@gmail.com> wrote:
>> On Mon, Jul 19, 2010 at 9:40 PM, Keith Goodman <kwgood...@gmail.com> wrote:
>>> On Mon, Jul 19, 2010 at 8:27 PM, Charles R Harris
>>> <charlesr.har...@gmail.com> wrote:
>>>> On Mon, Jul 19, 2010 at 9:02 PM, Keith Goodman <kwgood...@gmail.com>
>>>> wrote:
>>>>> On Mon, Jul 19, 2010 at 6:53 PM, Joshua Holbrook
>>>>> <josh.holbr...@gmail.com> wrote:
>>>>>> On Mon, Jul 19, 2010 at 5:50 PM, Charles R Harris
>>>>>> <charlesr.har...@gmail.com> wrote:
>>>>>>> Hi All,
>>>>>>>
>>>>>>> I'm thinking about adding some functionality to lstsq because I find
>>>>>>> myself doing the same fixes over and over. List follows.
>>>>>>>
>>>>>>> 1. Add weights so data points can be weighted.
>>>>>>> 2. Use column scaling so condition numbers make more sense.
>>>>>>> 3. Compute covariance approximation?
>>>>>>>
>>>>>>> Unfortunately, the last will require using svd since there are no
>>>>>>> linear least squares routines in LAPACK that also compute the
>>>>>>> covariance, at least none that Google knows about.
>>>>>>>
>>>>>>> Thoughts?
>>>>>>
>>>>>> Maybe make 2 functions--one which implements 1 and 2, and another
>>>>>> which implements 3? I think weights is an excellent idea!
>>>>>
>>>>> I'd like an lstsq that did less, like not calculate the sum of squared
>>>>> residuals. That's useful in tight loops. So I also think having two
>>>>> lstsqs makes sense: one as basic as possible, one with bells. How does
>>>>> scipy's lstsq fit into all this?
>>>>
>>>> I think the computation of the residues is cheap in lstsq. The algorithm
>>>> used starts by reducing the design matrix to bidiagonal form and reduces
>>>> the rhs at the same time. In other words, an m x n problem becomes an
>>>> (n+1) x n problem. That's why the summed square of residuals is
>>>> available but not the individual residuals: after the reduction there is
>>>> only one residual, and its square is the residue.
>>>
>>> That does sound good. But it must take some time. There's indexing,
>>> array creation, an if statement, summing:
>>>
>>>     if results['rank'] == n and m > n:
>>>         resids = sum((transpose(bstar)[n:,:])**2, axis=0).astype(result_t)
>>>
>>> Here are the timings after removing the sum of squared residuals:
>>>
>>> >> x = np.random.rand(1000,10)
>>> >> y = np.random.rand(1000)
>>> >> timeit np.linalg.lstsq(x, y)
>>> 1000 loops, best of 3: 369 us per loop
>>> >> timeit np.linalg.lstsq2(x, y)
>>> 1000 loops, best of 3: 344 us per loop
>>>
>>> >> x = np.random.rand(10,2)
>>> >> y = np.random.rand(10)
>>> >> timeit np.linalg.lstsq(x, y)
>>> 10000 loops, best of 3: 102 us per loop
>>> >> timeit np.linalg.lstsq2(x, y)
>>> 10000 loops, best of 3: 77 us per loop
>>
>> Now that I've looked through the driver program I see that you are right.
>> Also, all the info needed for the covariance is almost available in the
>> LAPACK driver program. Hmm, it seems that maybe the best thing to do here
>> is to pull out the lapack_lite driver program, modify it, and make it a
>> standalone python module that links to either the lapack_lite programs or
>> the ATLAS versions. But that is more work than just doing things in
>> python :-\ It does have the added advantage that all the work arrays can
>> be handled internally.
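In the meantime, the covariance approximation is cheap to get in pure
Python once the SVD is in hand. A rough sketch only, assuming a full-rank
system with more rows than columns (lstsq_with_cov is just an illustrative
name, not a proposed API):

    import numpy as np

    def lstsq_with_cov(a, b):
        # Least squares via the SVD, plus the usual covariance
        # approximation cov(x) = sigma**2 * inv(dot(a.T, a)).
        # Assumes a is (m, n) with m > n and full column rank.
        u, s, vt = np.linalg.svd(a, full_matrices=False)
        x = np.dot(vt.T, np.dot(u.T, b) / s)
        r = b - np.dot(a, x)
        m, n = a.shape
        sigma2 = np.dot(r, r) / (m - n)         # residual variance estimate
        # inv(dot(a.T, a)) == dot(vt.T / s**2, vt), from a = u * diag(s) * vt
        cov = sigma2 * np.dot(vt.T / s**2, vt)
        return x, cov

Weights and column scaling would then amount to rescaling the rows of a
and b (weights) or the columns of a (scaling) before the solve, and
undoing the column scaling on x and cov afterwards.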
> While taking a quick look at lstsq to see what I could cut, this line
> caught my eye:
>
>     bstar[:b.shape[0],:n_rhs] = b.copy()
Even if bstar contained a view of b, the next line in lstsq would take
care of it:

    a, bstar = _fastCopyAndTranspose(t, a, bstar)

> I thought that array.__setitem__ never gives a view of the right hand
> side. If that's so then the copy is not needed. That would save some
> time since b can be large. All unit tests pass when I remove the copy,
> but you know how that goes...
>
> I also noticed that "import math" is done inside the lstsq function.
> Why is that?
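For what it's worth, a quick check supports the no-view behavior:
__setitem__ copies the data from the right hand side rather than keeping a
reference to it, so removing the explicit copy should be safe on that
score. A toy example of the general behavior, not the lstsq code itself:

    import numpy as np

    b = np.arange(6.0).reshape(3, 2)
    bstar = np.zeros((5, 2))
    bstar[:b.shape[0], :] = b   # assign without an explicit copy
    b[0, 0] = 99.0              # mutate the source afterwards
    print(bstar[0, 0])          # prints 0.0: bstar kept its own copy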