2013/10/9 Peter Prettenhofer <[email protected]>:
> great - thanks Lars - will prepare a PR
I just realized that I forgot to benchmark the sparse case as well. There, having a C-ordered RHS can still give a speed boost:

>>> X = fetch_20newsgroups_vectorized().data
>>> Y = np.random.randn(X.shape[1], 20)
>>> %timeit X * Y
10 loops, best of 3: 64 ms per loop
>>> Yf = np.asfortranarray(Y)
>>> %timeit X * Yf
10 loops, best of 3: 72.7 ms per loop

With a larger number of classes, it seems to get more extreme:

>>> Y = np.random.randn(X.shape[1], 200)
>>> Yf = np.asfortranarray(Y)
>>> %timeit X * Y
1 loops, best of 3: 381 ms per loop
>>> %timeit X * Yf
1 loops, best of 3: 498 ms per loop

Though I prefer a working SGD to a fast one that doesn't work ;)
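For anyone who wants to reproduce this outside IPython, here is a standalone sketch of the same comparison using plain timeit instead of %timeit. The shapes mirror the session above; the printed timings are of course machine-dependent and will not match the numbers quoted here.

    # Sketch: sparse CSR matrix times a C-ordered vs. Fortran-ordered dense RHS.
    import timeit
    import numpy as np
    from sklearn.datasets import fetch_20newsgroups_vectorized

    X = fetch_20newsgroups_vectorized().data      # scipy.sparse CSR matrix
    Y = np.random.randn(X.shape[1], 20)           # C-ordered RHS (NumPy default)
    Yf = np.asfortranarray(Y)                     # same values, Fortran-ordered

    for label, rhs in [("C-ordered", Y), ("Fortran-ordered", Yf)]:
        t = timeit.timeit(lambda: X * rhs, number=10)
        print("%-16s %.1f ms per product" % (label, 1000 * t / 10))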
