ptrendx commented on issue #17767: [WIP] Fix and optimize handling of 
vectorized memory accesses
URL: https://github.com/apache/incubator-mxnet/pull/17767#issuecomment-601959240
 
 
   @haojin2 I'm hitting the OoM on the Windows build again, like you did before. I 
started looking at the numpy versions of those functions and I see that you 
have way more templates there (some of which are actually not compiled on 
Windows), so I started thinking that we should probably just switch to runtime 
compilation of those kernels - there are just too many variants here. What do 
you think about this (also @eric-haibin-lin @szha @leezu for comments)?
   
   Also - I don't see elementwise ops in the numpy python package, just 
broadcast ops. This is pretty bad, because knowing that the shapes are the 
same is important for optimizations: for example, the pointwise fusion 
would not really work for such operators.
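
   To illustrate why the same-shape guarantee matters, here is a minimal 
pure-Python sketch (not MXNet code; the function names `elementwise_add` and 
`broadcast_add` are hypothetical and the inputs are assumed to be flat 
row-major buffers of equal rank). The elementwise path can reuse one flat 
index for every operand, while the broadcast path has to decompose the 
index and apply per-input strides for each output element:

```python
def elementwise_add(a, b):
    # Shapes are known to be identical, so a single flat index
    # addresses both inputs and the output.
    return [a[i] + b[i] for i in range(len(a))]


def _broadcast_strides(shape):
    # Row-major strides, with stride 0 on size-1 axes so that the
    # single element along that axis is reused (broadcast).
    strides, acc = [], 1
    for dim in reversed(shape):
        strides.append(0 if dim == 1 else acc)
        acc *= dim
    return strides[::-1]


def broadcast_add(a, a_shape, b, b_shape):
    # General case: shapes may differ on axes where one side has size 1.
    out_shape = [max(x, y) for x, y in zip(a_shape, b_shape)]
    sa, sb = _broadcast_strides(a_shape), _broadcast_strides(b_shape)
    total = 1
    for dim in out_shape:
        total *= dim
    out = []
    for flat in range(total):
        # Unravel the flat output index into coordinates, then map the
        # coordinates to an offset in each input. This per-element
        # work is what the elementwise path avoids.
        rem, ia, ib = flat, 0, 0
        for axis in reversed(range(len(out_shape))):
            rem, coord = divmod(rem, out_shape[axis])
            ia += coord * sa[axis]
            ib += coord * sb[axis]
        out.append(a[ia] + b[ib])
    return out
```

   With identical shapes both functions give the same result, but only the 
first one exposes that fact to an optimizer, which is roughly the 
information a pointwise fusion pass would need.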

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
