comaniac commented on a change in pull request #59: URL: https://github.com/apache/tvm-rfcs/pull/59#discussion_r820002751
########## File path: rfcs/0059-remove-tophub-from-default.md ########## @@ -0,0 +1,57 @@ +- Feature Name: remove_tophub_from_default +- Start Date: 2021-03-04 +- RFC PR: [apache/tvm-rfcs#0059](https://github.com/apache/tvm-rfcs/pull/0059) +- GitHub Issue: [apache/tvm#10474](https://github.com/apache/tvm/pull/10474) + +# Summary +[summary]: #summary + +Remove tophub from being the default autotvm configuration when compiling with +the VM or graph executor. This bring thes VM and graph executor compilation +workflow in line with all other workflows which do no use tophub by default. + +https://github.com/apache/tvm/pull/10474 provides a short discussion of the +tradeoffs/issues in applying this change. + +# Motivation +[motivation]: #motivation + +- Tophub is not maintained and is out of date in some cases. Users should + explicitly use it so there is no surprising behavior in using out of date + schedules. +- Tophub is implicitly used only with the VM and graph executor compilation + flows. Other flows (like `VMCompiler.optimize`) will return different + results because they do not implicitly use tophub. +- Many default schedules exist in TVM, but they are not used because the + tophub schedules override them. Either they should be removed or used. +- Compilation implicitly depends on scheduling choices located outside of the + TVM codebase. This implicit behavior is confusing when debugging and + requires us to maintain a consistent state between two different repos. + +# Drawbacks +[drawbacks]: #drawbacks + +We may see a performance degradation in models compiled without tuning as the +tophub schedules may be more optimized then the fallback/untuned schedules. If +this is an issue, we can port the poorly performing schedules from tophub to +the fallback (i.e. default) configuration for the op. + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +Alternative choices: + +- Use tophub as the default context when no other context has been specified. + This will make `VMCompiler.optimize` and `VMCompiler.lower` consistent in + their results. + +- Add tophub to `VMCompiler.optimize`. This doesn't solve the underlying + problem. I expect us see more inconsistent results in other parts of the + codebase where tophub is not used. + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +Will there be any actual performance impact in removing tophub from being the default? Review comment: Based on the CI, at least we know that the workloads in MobileNet on GPU require TopHub schedules to be compiled. The default schedules are invalid for some GPUs (e.g., the one used in CI) and will result in compilation failure. Given this fact, we cannot guarantee default schedules are valid for all workloads on GPU, and TopHub could more or less help complement this issue. Thus, if we go with this RFC, the concern I may have is how should we port the TopHub schedule to be the default one? I could imagine that we may end up having a decision-tree like logic for each op to dispatch the desired default schedule. This may make TOPI even more ad-hoc and complicate. I would definitely like to hear others feedback about this. ########## File path: rfcs/0059-remove-tophub-from-default.md ########## @@ -0,0 +1,57 @@ +- Feature Name: remove_tophub_from_default +- Start Date: 2021-03-04 +- RFC PR: [apache/tvm-rfcs#0059](https://github.com/apache/tvm-rfcs/pull/0059) +- GitHub Issue: [apache/tvm#10474](https://github.com/apache/tvm/pull/10474) + +# Summary +[summary]: #summary + +Remove tophub from being the default autotvm configuration when compiling with +the VM or graph executor. This bring thes VM and graph executor compilation +workflow in line with all other workflows which do no use tophub by default. + +https://github.com/apache/tvm/pull/10474 provides a short discussion of the +tradeoffs/issues in applying this change. + +# Motivation +[motivation]: #motivation + +- Tophub is not maintained and is out of date in some cases. Users should + explicitly use it so there is no surprising behavior in using out of date + schedules. +- Tophub is implicitly used only with the VM and graph executor compilation + flows. Other flows (like `VMCompiler.optimize`) will return different + results because they do not implicitly use tophub. +- Many default schedules exist in TVM, but they are not used because the + tophub schedules override them. Either they should be removed or used. +- Compilation implicitly depends on scheduling choices located outside of the + TVM codebase. This implicit behavior is confusing when debugging and + requires us to maintain a consistent state between two different repos. + +# Drawbacks +[drawbacks]: #drawbacks + +We may see a performance degradation in models compiled without tuning as the +tophub schedules may be more optimized then the fallback/untuned schedules. If +this is an issue, we can port the poorly performing schedules from tophub to +the fallback (i.e. default) configuration for the op. + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +Alternative choices: + +- Use tophub as the default context when no other context has been specified. + This will make `VMCompiler.optimize` and `VMCompiler.lower` consistent in + their results. + +- Add tophub to `VMCompiler.optimize`. This doesn't solve the underlying + problem. I expect us see more inconsistent results in other parts of the + codebase where tophub is not used. + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +Will there be any actual performance impact in removing tophub from being the default? + +Should we provide a deprecation notice, or a change notice, or no notice at all? Review comment: If we go with this RFC, I vote for a WARNING in graph executor saying that TopHub is no longer applied automatically, please xxx (provide a link or just describe how to bring TopHub back if users want). We could further discuss how long should the warning be. Formally, it should be there for at least one release cycle (0.9 -> 1.0), but we could make it shorter if everyone agrees. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
