Many thanks to Josef and Justin for their replies. Josef's hint sounds like a good way of reducing peak memory allocation especially when the row size is large, which makes the "for" overhead for each iteration comparatively lower. However, time is still spent in back-and-forth conversions between numpy arrays and the native BLAS data structures, and copying data from the temporary array holding the intermediate results and tableau.
Regarding Justin's suggestion, before trying Cython (which, according to http://wiki.cython.org/tutorials/numpy , seems to require a bit of work to handle numpy arrays properly) I was looking at weave.blitz . Unfortunately, this doesn't seems to like my code. Running code containing: expr = "tableau = tableau - tableau[:,cand:cand+1]*pivot" weave.blitz(expr) ...elicits: /---------------------------------------------------------- distutils.errors.CompileError: error: Command "g++ -mno-cygwin -O2 -Wall -IC:\Python26\lib\site-packages\scipy\weave -IC:\Python26\lib\site-packages\scipy\weave\scxx -IC:\Python26\lib\site-packages\scipy\weave\blitz -IC:\Python26\lib\site-packages\numpy\core\include -IC:\Python26\include -IC:\Python26\PC -c c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp -o c:\docume~1\admin\locals~1\temp\ADMIN\python26_intermediate\compiler_4b433fbf94fa137eaa5ee69a06987eda\Release\docume~1\admin\locals~1\temp\admin\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.o" failed with exit status 1 \---------------------------------------------------------- >From the error message issued by g++, it would appear that blitz can't figure out the type of cand: /---------------------------------------------------------- C:\Documents and Settings\ADMIN\My Documents\Projects\Valerio\py>g++ -mno-cygwin -O2 -Wall -IC:\Python26\lib\site-packages\scipy\weave -IC:\Python26\lib\site-packages\scipy\weave\scxx -IC:\Python26\lib\site-packages\scipy\weave\blitz -IC:\Python26\lib\site-packages\numpy\core\include -IC:\Python26\include -IC:\Python26\PC -c c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp -o c:\docume~1\admin\locals~1\temp\ADMIN\python26_intermediate\compiler_4b433fbf94fa137eaa5ee69a06987eda\Release\docume~1\admin\locals~1\temp\admin\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.o In file included from C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array-impl.h:37, from C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array.h:26, from c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp:11: C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/range.h: In member function 'bool blitz::Range::isAscendingContiguous() const': C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/range.h:120: warning: suggest parentheses around '&&' within '||' c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp: In function 'PyObject* compiled_func(PyObject*, PyObject*)': c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp:728: error: ambiguous overload for 'operator+' in 'cand + 1' c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp:728: note: candidates are: operator+(PyObject*, int) <built-in> c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp:728: note: operator+(int, int) <built-in> c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp:728: note: operator+(float, int) <built-in> c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp:728: note: operator+(double, int) <built-in> c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp:728: note: operator+(char*, int) <built-in> \---------------------------------------------------------- Using a temporary variable to keep cand out of blitz: tmp = tableau[:,cand:cand+1] expr = "tableau = tableau - tmp*pivot" weave.blitz(expr) ...produces an even uglier error message, which makes me think that blitz doesn't understand that the product between a (n,1)-shaped array and an (n,)-shaped one is meant to be an outer product: /---------------------------------------------------------- C:\Documents and Settings\ADMIN\My Documents\Projects\Valerio\py>LCPSolve.py Found executable C:\Program Files\pythonxy\mingw\bin\g++.exe In file included from C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array-impl.h:37, from C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array.h:26, from c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_c9159c98b571d3181d8848337bf1e50a1.cpp:11: C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/range.h: In member function 'bool blitz::Range::isAscendingContiguous() const': C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/range.h:120: warning: suggest parentheses around '&&' within '||' In file included from C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array-impl.h:2504, from C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array.h:26, from c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_c9159c98b571d3181d8848337bf1e50a1.cpp:11: C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/expr.h: In member function 'typename P_op::T_numtype blitz::_bz_ArrayExprBinaryOp<P_expr1, P_expr2, P_op>::operator()(const blitz::TinyVector<int, N_rank>&) [with int N_rank = 2, P_expr1 = blitz::FastArrayIterator<double, 2>, P_expr2 = blitz::FastArrayIterator<double, 1>, P_op = blitz::Multiply<double, double>]': C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/expr.h:144: instantiated from 'typename P_expr::T_numtype blitz::_bz_ArrayExpr<P_expr>::operator()(const blitz::TinyVector<int, N_rank>&) [with int N_rank = 2, P_expr = blitz::_bz_ArrayExprBinaryOp<blitz::FastArrayIterator<double, 2>, blitz::FastArrayIterator<double, 1>, blitz::Multiply<double, double> >]' C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/expr.h:486: instantiated from 'typename P_op::T_numtype blitz::_bz_ArrayExprBinaryOp<P_expr1, P_expr2, P_op>::operator()(const blitz::TinyVector<int, N_rank>&) [with int N_rank = 2, P_expr1 = blitz::FastArrayIterator<double, 2>, P_expr2 = blitz::_bz_ArrayExpr<blitz::_bz_ArrayExprBinaryOp<blitz::FastArrayIterator<double, 2>, blitz::FastArrayIterator<double, 1>, blitz::Multiply<double, double> > >, P_op = blitz::Subtract<double, double>]' C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/expr.h:144: instantiated from 'typename P_expr::T_numtype blitz::_bz_ArrayExpr<P_expr>::operator()(const blitz::TinyVector<int, N_rank>&) [with int N_rank = 2, P_expr = blitz::_bz_ArrayExprBinaryOp<blitz::FastArrayIterator<double, 2>, blitz::_bz_ArrayExpr<blitz::_bz_ArrayExprBinaryOp<blitz::FastArrayIterator<double, 2>, blitz::FastArrayIterator<double, 1>, blitz::Multiply<double, double> > >, blitz::Subtract<double, double> >]' C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/eval.cc:670: instantiated from 'blitz::Array<P_numtype, N_rank>& blitz::Array<T, N>::evaluateWithIndexTraversal1(T_expr, T_update) [with T_expr = blitz::_bz_ArrayExpr<blitz::_bz_ArrayExprBinaryOp<blitz::FastArrayIterator<double, 2>, blitz::_bz_ArrayExpr<blitz::_bz_ArrayExprBinaryOp<blitz::FastArrayIterator<double, 2>, blitz::FastArrayIterator<double, 1>, blitz::Multiply<double, double> > >, blitz::Subtract<double, double> > >, T_update = blitz::_bz_update<double, double>, P_numtype = double, int N_rank = 2]' C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/eval.cc:171: instantiated from 'blitz::Array<P_numtype, N_rank>& blitz::Array<T, N>::evaluate(T_expr, T_update) [with T_expr = blitz::_bz_ArrayExpr<blitz::_bz_ArrayExprBinaryOp<blitz::FastArrayIterator<double, 2>, blitz::_bz_ArrayExpr<blitz::_bz_ArrayExprBinaryOp<blitz::FastArrayIterator<double, 2>, blitz::FastArrayIterator<double, 1>, blitz::Multiply<double, double> > >, blitz::Subtract<double, double> > >, T_update = blitz::_bz_update<double, double>, P_numtype = double, int N_rank = 2]' C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/ops.cc:45: instantiated from 'blitz::Array<P_numtype, N_rank>& blitz::Array<T, N>::operator=(const blitz::ETBase<T_expr>&) [with T_expr = blitz::_bz_ArrayExpr<blitz::_bz_ArrayExprBinaryOp<blitz::FastArrayIterator<double, 2>, blitz::_bz_ArrayExpr<blitz::_bz_ArrayExprBinaryOp<blitz::FastArrayIterator<double, 2>, blitz::FastArrayIterator<double, 1>, blitz::Multiply<double, double> > >, blitz::Subtract<double, double> > >, P_numtype = double, int N_rank = 2]' c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_c9159c98b571d3181d8848337bf1e50a1.cpp:732: instantiated from here C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/expr.h:486: error: no match for call to '(blitz::FastArrayIterator<double, 1>) (const blitz::TinyVector<int, 2>&)' C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/fastiter.h:74: note: candidates are: P_numtype blitz::FastArrayIterator<T_numtype, N_rank>::operator()(const blitz::TinyVector<int, N_destRank>&) [with P_numtype = double, int N_rank = 1] C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/fastiter.h:202: note: P_numtype& blitz::FastArrayIterator<T_numtype, N_rank>::operator()(int) [with P_numtype = double, int N_rank = 1] C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/fastiter.h:208: note: P_numtype& blitz::FastArrayIterator<T_numtype, N_rank>::operator()(int, int) [with P_numtype = double, int N_rank = 1] C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/fastiter.h:214: note: P_numtype& blitz::FastArrayIterator<T_numtype, N_rank>::operator()(int, int, int) [with P_numtype = double, int N_rank = 1] Traceback (most recent call last): File "C:\Documents and Settings\ADMIN\My Documents\Projects\Valerio\py\LCPSolve.py", line 132, in <module> w, z, retcode = LCPSolve(M,q) File "C:\Documents and Settings\ADMIN\My Documents\Projects\Valerio\py\LCPSolve.py", line 94, in LCPSolve weave.blitz(expr) File "C:\Python26\lib\site-packages\scipy\weave\blitz_tools.py", line 65, in blitz **kw) File "C:\Python26\lib\site-packages\scipy\weave\inline_tools.py", line 482, in compile_function verbose=verbose, **kw) File "C:\Python26\lib\site-packages\scipy\weave\ext_tools.py", line 367, in compile verbose = verbose, **kw) File "C:\Python26\lib\site-packages\scipy\weave\build_tools.py", line 273, in build_extension setup(name = module_name, ext_modules = [ext],verbose=verb) File "C:\Python26\lib\site-packages\numpy\distutils\core.py", line 186, in setup return old_setup(**new_attr) File "C:\Python26\lib\distutils\core.py", line 169, in setup raise SystemExit, "error: " + str(msg) distutils.errors.CompileError: error: Command "g++ -mno-cygwin -O2 -Wall -IC:\Python26\lib\site-packages\scipy\weave -IC:\Python26\lib\site-packages\scipy\weave\scxx -IC:\Python26\lib\site-packages\scipy\weave\blitz -IC:\Python26\lib\site-packages\numpy\core\include -IC:\Python26\include -IC:\Python26\PC -c c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_c9159c98b571d3181d8848337bf1e50a1.cpp -o c:\docume~1\admin\locals~1\temp\ADMIN\python26_intermediate\compiler_4b433fbf94fa137eaa5ee69a06987eda\Release\docume~1\admin\locals~1\temp\admin\python26_compiled\sc_c9159c98b571d3181d8848337bf1e50a1.o" failed with exit status 1 \---------------------------------------------------------- So, for the time being, no speed breakthrough... Enzo ----- Original Message ----- From: "Justin Peel" <[email protected]> To: "Discussion of Numerical Python" <[email protected]> Sent: Monday, December 27, 2010 2:51 PM Subject: Re: [Numpy-discussion] Optimization suggestion sought On Sun, Dec 26, 2010 at 7:34 AM, <[email protected]> wrote: > On Sun, Dec 26, 2010 at 3:51 AM, Enzo Michelangeli <[email protected]> > wrote: >> For a pivoted algorithm, I have to perform an operation that in fully >> vectorized form can be expressed as: >> >> pivot = tableau[locat,:]/tableau[locat,cand] >> tableau -= tableau[:,cand:cand+1]*pivot >> tableau[locat,:] = pivot >> >> tableau is a rather large bidimensional array, and I'd like to avoid the >> allocation of a temporary array of the same size holding the result of >> the >> right-hand side expression in the second line of code (the outer product >> of >> tableau[:,cand] and pivot). On the other hand, if I replace that line >> with: >> >> for i in xrange(tableau.shape[0]): >> tableau[i] -= tableau[i,cand]*pivot >> >> ...I incur some CPU overhead for the "for" loop -- and this part of code >> is >> the botteneck of the whole algorithm. Is there any smarter (i.e., more >> time-efficient) way of achieving my goal? > > just a generic answer: > > Working in batches can be a good compromise in some cases. I instead > of working in a loop with one row at a time, loop and handle, for > example, 1000 rows at a time. > > Josef > >> >> TIA -- >> >> Enzo >> >> _______________________________________________ >> NumPy-Discussion mailing list >> [email protected] >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > [email protected] > http://mail.scipy.org/mailman/listinfo/numpy-discussion > If this is really such a big bottleneck, then I would look into using Cython for this part. With just a few cdef's, I bet that that you could speed up the for loop tremendously. Depending on the details of your algorithm, you might want to make a Cython function that takes tableau, cand and pivot as inputs and just does the for loop part. _______________________________________________ NumPy-Discussion mailing list [email protected] http://mail.scipy.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list [email protected] http://mail.scipy.org/mailman/listinfo/numpy-discussion
