I'm resending this email because the mailing list rejected it, so I have now subscribed to the list.
Hi,

Sorry for the delay, my schedule changed. Thanks for adding me. Should I subscribe to cython-dev? How many emails are there per day? I couldn't find this in the archives. Feel free to add me in CC again when you think it is appropriate.

I'll reply here to all the emails at once. Would you prefer that I reply to each email individually if this happens again? I'll try to reply faster next time.

- About pickling Theano: we currently can't pickle a Theano function. It could be made to work in some cases, but not in all, because the Theano function contains hardware-dependent optimizations. Currently this is mostly CPU vs. GPU operations. So if we stay on the CPU we could do some pickling, but we would have to make sure that the C code compiled into Python modules is still there when we unpickle, or recompile it.

- I think it makes sense to build a Theano graph from the Cython AST, optimize it, and regenerate a Cython AST from the optimized graph. This would allow using Theano's optimizations.

- It also makes sense to do the code generation in Theano and reuse it in Cython, but this would make the Theano dependency much stronger. I'm not sure you want this.

- Another point not raised yet: Theano needs to know at compile time the dtype, the number of dimensions, and which dimensions are broadcastable for each variable. I think the last one could cause problems, but if you use specialization for the dtype, the same can be done for the broadcastability of each dimension.

- The compyte (GPU nd array) project collapses dimensions. This is an important optimization on the GPU, where doing the index computation in parallel is costlier. I think on the CPU we could probably collapse just the inner dimensions to make it faster.

- Theano doesn't generate intrinsics or assembly; we assume that g++ will generate vectorized operations for simple loops. Recent versions of gcc/g++ do this.

- Our generated code for element-wise operations takes some care over the memory access pattern.
We swap dimensions to iterate over the dimensions with the smallest strides, but we don't go further than that.

- What do you mean by CSE? Constant optimization?

Fred
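
P.S. To make the stride ordering and dimension collapsing points above more concrete, here is a minimal pure-Python sketch. It is only an illustration and not the actual Theano or compyte code. The function names `order_by_stride` and `collapse_dims` are made up for this example, it assumes that "iterate on the smallest strides" means putting the smallest-stride dimension in the innermost loop, and it only looks at a single array (a real element-wise op would have to check that all operands collapse the same way).

```python
def order_by_stride(shape, strides):
    """Reorder dimensions so the smallest-stride one is innermost (last).

    Hypothetical helper: takes shape and byte strides as tuples.
    """
    order = sorted(range(len(shape)), key=lambda i: abs(strides[i]), reverse=True)
    return tuple(shape[i] for i in order), tuple(strides[i] for i in order)


def collapse_dims(shape, strides):
    """Merge adjacent dimensions that can be walked as one flat dimension.

    An outer dimension and the inner dimension next to it are merged when
    outer_stride == inner_stride * inner_length, i.e. stepping once in the
    outer dimension lands exactly where the inner loop would have ended.
    """
    if not shape:
        return (), ()
    out_shape = [shape[0]]
    out_strides = [strides[0]]
    for n, s in zip(shape[1:], strides[1:]):
        if out_strides[-1] == s * n:
            # Fold this dimension into the previous one.
            out_shape[-1] *= n
            out_strides[-1] = s
        else:
            out_shape.append(n)
            out_strides.append(s)
    return tuple(out_shape), tuple(out_strides)


# C-contiguous float64 array of shape (4, 5, 6): one flat loop is enough.
print(collapse_dims((4, 5, 6), (240, 48, 8)))   # ((120,), (8,))

# Fortran-ordered (3, 4) array: swap first, then it collapses as well.
print(collapse_dims(*order_by_stride((3, 4), (8, 24))))   # ((12,), (8,))

# A view keeping 3 of 6 columns has gaps between rows, so nothing collapses.
print(collapse_dims((4, 3), (48, 8)))   # ((4, 3), (48, 8))
```

With fewer dimensions after collapsing there is less index arithmetic per element, and the remaining inner loop is a simple strided loop of the kind the compiler has a chance to vectorize.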