larroy commented on issue #14570: add a compiler flag to use int64 as tensor 
size
URL: https://github.com/apache/incubator-mxnet/pull/14570#issuecomment-484581787
 
 
   Thanks a lot for the detailed report and analysis, @apeforest and 
@samskalicky. One thing I'm missing from the data you guys have provided is a 
disassembly dump of the small loop that is supposed to be slow. Could you 
provide this with "objdump -d" or similar? I find it surprising that the 
degradation is due only to data widening. I suspect the cause is the memory 
access more than the wider arithmetic operation itself. Having the assembly as 
an additional data point would help us reach a more solid conclusion.
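
To make the request concrete, here is a minimal sketch of the kind of comparison I have in mind. It is not MXNet code: `scaled_sum`, its parameters and the file name `loop.cpp` are hypothetical stand-ins for the small loop in question, and the only thing that differs between the two instantiations is the index type.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the loop under discussion (an assumption, not
// taken from the MXNet sources). IndexT is the only difference between the
// two variants, so any difference in the generated code comes from it.
template <typename IndexT>
float scaled_sum(const float* data, IndexT n, IndexT stride) {
  float acc = 0.0f;
  for (IndexT i = 0; i < n; ++i) {
    // The index arithmetic is done in IndexT, then used to address memory.
    acc += data[static_cast<std::size_t>(i * stride)];
  }
  return acc;
}

// Explicit instantiations so both variants are emitted into the object file
// and show up in the disassembly even without a caller.
template float scaled_sum<std::int32_t>(const float*, std::int32_t, std::int32_t);
template float scaled_sum<std::int64_t>(const float*, std::int64_t, std::int64_t);
```

Assuming GCC and binutils, something like `g++ -O2 -c loop.cpp -o loop.o` followed by `objdump -d -C loop.o` would show the two loop bodies side by side. The same approach applied to the real kernel should show whether the widened version differs mainly in the arithmetic instructions or in how memory is addressed and loaded.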

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
