ptrendx commented on issue #17767: [WIP] Fix and optimize handling of vectorized memory accesses
URL: https://github.com/apache/incubator-mxnet/pull/17767#issuecomment-601959240

@haojin2 I'm hitting the OOM on the Windows build again, like you did before. I started looking at the numpy versions of those functions and I see that you have way more templates there (some of which are actually not compiled on Windows), so I started thinking that we should probably just switch to runtime compilation of those kernels - there are just too many variants here. What do you think about this (also @eric-haibin-lin @szha @leezu for comments)?

Also - I don't see elementwise ops in the numpy python package, just broadcast ops. This is pretty bad, because knowing that the shapes are the same is pretty important for optimizations - for example, the pointwise fusion would not really work for such operators.
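
To illustrate the point about elementwise versus broadcast ops, here is a rough pure-Python sketch (not MXNet code; all function names are hypothetical and chosen only for illustration). When the shapes are known to match, the op reduces to a single flat loop with one shared index. A broadcast op has to compute a separate offset into each input for every output element, which is the kind of extra indexing work that a same-shape guarantee lets an optimizer skip.

```python
from itertools import product


def elementwise_add(a, b):
    """Same-shape case: one flat index addresses both inputs and the output."""
    if len(a) != len(b):
        raise ValueError("elementwise op requires identical shapes")
    return [x + y for x, y in zip(a, b)]


def broadcast_shape(s1, s2):
    """Output shape under NumPy-style broadcasting rules."""
    n = max(len(s1), len(s2))
    s1 = (1,) * (n - len(s1)) + tuple(s1)
    s2 = (1,) * (n - len(s2)) + tuple(s2)
    out = []
    for d1, d2 in zip(s1, s2):
        if d1 != d2 and d1 != 1 and d2 != 1:
            raise ValueError("shapes are not broadcastable")
        out.append(max(d1, d2))
    return tuple(out)


def broadcast_strides(shape, out_ndim):
    """Row-major strides, with stride 0 on axes that are broadcast (size 1)."""
    shape = (1,) * (out_ndim - len(shape)) + tuple(shape)
    strides = []
    step = 1
    for d in reversed(shape):
        strides.append(0 if d == 1 else step)
        step *= d
    return tuple(reversed(strides))


def broadcast_add(a, a_shape, b, b_shape):
    """General case: every output element needs per-input offset arithmetic."""
    out_shape = broadcast_shape(a_shape, b_shape)
    sa = broadcast_strides(a_shape, len(out_shape))
    sb = broadcast_strides(b_shape, len(out_shape))
    out = []
    for idx in product(*(range(d) for d in out_shape)):
        ia = sum(i * s for i, s in zip(idx, sa))  # offset into flat buffer a
        ib = sum(i * s for i, s in zip(idx, sb))  # offset into flat buffer b
        out.append(a[ia] + b[ib])
    return out, out_shape
```

Both functions give the same result when the shapes happen to match, but only the elementwise version can assume that up front; the broadcast version has to carry the stride bookkeeping regardless.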
