Recently I found that my application spends ~65% of its time in the garbage 
collector. I'm looking for ways to reduce the amount of memory allocated for 
intermediate results. 
For example, I found that "A * B" can be replaced with "A_mul_B!(out, A, B)", 
which writes into a preallocated "out" buffer and thus almost eliminates the 
extra memory allocation. But my application still produces lots of garbage in 
operations like matrix addition/subtraction, multiplication by a scalar, etc. 
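
To make the pattern concrete, here is a minimal sketch of what I mean by a 
preallocated buffer. The matrix sizes, the loop, and the name inplace_mul! are 
made up for illustration. The version check assumes that A_mul_B! is available 
before Julia 0.7 and that mul! from LinearAlgebra replaces it from 0.7 on. 

```julia
# Pick the in-place multiply for the running Julia version
# (inplace_mul! is a hypothetical alias used only in this sketch).
if VERSION < v"0.7"
    inplace_mul! = A_mul_B!
else
    using LinearAlgebra
    inplace_mul! = mul!
end

A = rand(200, 200)
B = rand(200, 200)
out = zeros(200, 200)        # allocated once, outside the hot loop

for i in 1:10
    # C = A * B would allocate a fresh 200x200 matrix on every iteration;
    # the in-place call overwrites `out` instead.
    inplace_mul!(out, A, B)
end
```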

Are there any other tricks that help decrease memory usage?  
