junrushao1994 edited a comment on pull request #5585:
URL: https://github.com/apache/incubator-tvm/pull/5585#issuecomment-631147088


   As `CHECK` causes significant regression in stack utilization, we dived 
deeper into this case, and then figured out that `dmlc::LogMessageFatal` is the 
cause - it is super sophisticated and throws inside the destructor.
   
   So we can patch dmlc-core, move `dmlc::LogMessageFatal` from stack to heap. 
A simple trick is to wrap it with `std::unique_ptr`, because dmlc-core has been 
updated to C++ 11.
   
   Besides patching dmlc-core, we can change the implementation of 
`tvm::runtime::Array` as well as follows:
   0) no change (using CHECK)
   1) throw dmlc::Error instead of using CHECK
   2) throw std::runtime_error instead of using CHECK 
   3) remove and disable the CHECK
   
   Below are the benchmarks we did. We use `g++ -fstack-usage` to dump the 
stack usage information for the target function.
   
   Settings:
   * Compiler: g++ 5.4
   * OS: Ubuntu 18.04
   * Stack size: ulimit -s 8192
   * Target function: `virtual tvm::relay::Expr 
tvm::relay::CommonSubexprEliminator::VisitExpr_(const tvm::relay::CallNode*)`, 
which originally crashed this PR on a very deep network
   
   Results:
   
   |                          | Patch dmlc-core | No patch dmlc-core |
   |--------------------------|-----------------|--------------------|
   | CHECK                    | 560             | 4656               |
   | throw dmlc::Error        | 736             | 1264               |
   | throw std::runtime_error | 480             | 1008               |
   | No checks (no-op)                | 480             | 1008               |
   
   It is unclear right now why throwing `dmlc::Error` takes more stack space 
than `std::runtime_error`, although they intent to be identical.
   
   Actions we plan to take:
   1) Patch dmlc-core using the aforementioned unique_ptr trick;
   2) Sync this branch with dmlc-core;
   3) Bring all the checks back.
   
   We do not intend to throw `std::runtime_error` directly, although it uses 
the least amount of stack space, to avoid being overly ad-hoc in error 
throwing/catching/reporting.
   
   CC: @yidawang @yzhliu @kevinthesun 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to