giuseros commented on pull request #5754: URL: https://github.com/apache/incubator-tvm/pull/5754#issuecomment-642564985
Hi @FrozenGene,

Just to clarify: I am enjoying the discussion, and since the optimization space is wild, I agree that it is worth evaluating different approaches.

* About the Raspberry + mobilenet v2: good to know you are working on Armv8-A (sorry to have assumed otherwise). However, there is still the point that mobilenet uses shallow convolutions, while I am addressing deeper and more generic convolutions.
* Are you saying that, as things stand now in TVM, the `conv2d_nhwc_spatial_pack` schedule might be faster than the gemm approach on smaller CPUs? Unfortunately, for now I don't think they can be added together because of what I said above about the legalization step. Do you know of any workaround for that? Maybe I can legalize only for specific devices (e.g., only for Cortex-A55)?
* Finally, as things stand now, we might get this PR in and later do a more detailed comparison across different networks + CPUs.
