yangulei commented on PR #11642: URL: https://github.com/apache/tvm/pull/11642#issuecomment-1157507315
> Could you please be more specific about scenarios you would like to optimize?

I tried zero-copy before #11345 was merged; it is no longer necessary, since your work already covers it.

Thanks for your explanation of `post-op sum`. Residual structures are widely used in CV models, so the performance gain from `post-op sum` is very important, and we want to use it whenever possible. I'll try adding a `reorder` that copies SRC to DST in DNNL; it would be nice if DNNL did nothing when SRC and DST are identical.

Regarding the automatic injection of the required reorder primitives, I think it is useful for ensuring the optimal DNNL layout at runtime. However, I think it is better to do the layout transform in `AlterOpLayout` before the graph is consumed by DNNL, so that the transform happens at compile/build time instead of at runtime.
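To illustrate the `reorder` idea, here is a minimal sketch in plain Python, not the DNNL API. It assumes `post-op sum` accumulates the primitive's result into the DST buffer in place, so the residual input has to be in DST beforehand. The names `reorder`, `op_with_post_sum`, and `residual_block` are hypothetical and only model the behavior discussed here.

```python
def reorder(src, dst):
    """Copy src into dst. Returns False (no copy) when both are the same buffer."""
    if src is dst:
        return False  # SRC and DST are identical: nothing to do
    dst[:] = src      # in-place copy, dst keeps its identity
    return True


def op_with_post_sum(compute, x, dst):
    """Model of a primitive with a sum post-op: dst = compute(x) + dst, in place."""
    for i, value in enumerate(compute(x)):
        dst[i] += value
    return dst


def residual_block(compute, x, residual, dst):
    """Place the residual in dst (if it is not already there), then accumulate."""
    copied = reorder(residual, dst)
    op_with_post_sum(compute, x, dst)
    return dst, copied


# Stand-in for the real computation (e.g. a convolution); an assumption for the demo.
double = lambda xs: [2 * v for v in xs]

# Separate buffers: the residual is copied into dst first.
out_a, copied_a = residual_block(double, [1, 2, 3], [10, 20, 30], [0, 0, 0])

# Shared buffer: the copy is skipped and the sum happens directly in place.
shared = [10, 20, 30]
out_b, copied_b = residual_block(double, [1, 2, 3], shared, shared)
```

Both calls produce the same values; only the second avoids the extra copy, which is the case the reorder would ideally handle as a no-op.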
