Hi @FrozenGene 
Just to clarify: I am enjoying the discussion, and since the optimization space 
is wild, I agree that it is worth evaluating different approaches. 
* About the Raspberry+mobilenet v2, good to know you are working on Armv8-A 
(sorry to have assumed otherwise). However, there is still the point that 
mobilenet uses shallow convolutions, while I am addressing deeper and more 
generic convolutions. 
* Are you saying that, as things stand now in TVM, the 
`conv2d_nhwc_spatial_pack` schedule might be faster than the gemm approach on 
smaller CPUs? Unfortunately, for now I don't think the two can coexist 
because of what I said above about the legalization step. Do you know of any 
workaround for that? Maybe I can legalize only for specific devices (e.g., only 
for Cortex-A55)? 
* Finally, as things stand now, we might get this PR in and do a more 
detailed comparison across different networks and CPUs later.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-tvm/pull/5754#issuecomment-642564985