areusch commented on a change in pull request #7742:
URL: https://github.com/apache/tvm/pull/7742#discussion_r631370429



##########
File path: src/runtime/stm32/graph_runtime.cc
##########
@@ -0,0 +1,529 @@
+/*

Review comment:
       hi Arthur @stoa,
   
   Ah I think I understand what's going on here...sorry for all of this 
confusion. You are stumbling into some problems we have not yet addressed for 
production inference use cases. There are a couple of things at play here:
   
   ### Error-handling paradigm and location of `TVMAPISetLastError()`.
   
   `TVMAPISetLastError` and `TVMGetLastError` should be in `crt_backend_api.h` 
as you pointed out. However, microTVM has meanwhile adopted `tvm_crt_error_t` 
for many of its non-c_backend_api (and indeed, also in c_backend_api via calls 
to TVMPlatform*) functions. 
   
   > Returning the error codes from the crt_backend_api seems to be our 
commonly preferred way of handling errors here - I am very interested in this 
being proposed for the mTVM.
   
   I concur with this opinion and this is what I'm going for with 
`tvm_crt_error_t`. I hadn't realized the misplacement of `TVMAPISetLastError`, 
since (from my reading) it's actually only used in generated code to handle 
assert fails, and I don't think we often use those in microTVM. Given this, I 
think it would be worth clarifying error handling in microTVM with a separate 
RFC. I don't think we should tackle that in this PR--perhaps a separate PR 
could:
   1. Move `TVMAPISetLastError` and `TVMGetLastError` to `c_backend_api.h`.
   2. Propose use of `tvm_crt_error_t` as the return value from TVM API 
functions, at least in the C runtime API. We could take this a step further and 
replace the `int` return value in the C++ runtime, but it may get complicated 
and `tvm_crt_error_t` is indeed compatible with the zero-indicates-success 
paradigm currently mandated by `c_{backend,runtime}_api.h`
   3. Allocate a `tvm_crt_error_t` code to indicate assert failure, and direct 
users to `TVMGetLastError` to retrieve the error text.
   
   I am happy to take on this RFC/PR if you would be okay waiting on it; or if 
you want to do it, I'm completely fine with that. Let me know if this direction 
makes sense to you.
   
   ###  Handling `TVMBackendAllocWorkspace` errors
   
   > I am not sure I understand this point. If the memory allocation fails, we 
have 2 choices:
   > 
   > - The operator code catches it and SetLastError(), eventually with a 
detailed error code. I do not see why this is difficult or not clean - but 
perhaps I do not have enough vision here. In any case, this seems to have 
already been implemented with the TVM operators, Is this going to change ? 
Which RFC ?
   > - The crt layer catches the error and handles via further hooks into the 
application code. We prefer to minimize such hooks.
   > I am not sure why you have mentioned the CHECK_EQ macro ?
   
   Here I was examining the existing implementation of 
`TVMBackendAllocWorkspace` in `src/runtime/crt/common/crt_backend_api.c`. That 
function is currently required to return NULL on error, so we would need to 
either:
   1. change it to return `int` or `tvm_crt_error_t` to indicate errors
   2. Make it follow the same paradigm as assert fails above
   
   I prefer 1 here, since an error code can programmatically indicate what's 
wrong and is easier to log. Perhaps it's best to propose this as part of the 
RFC mentioned above.
   
   I mentioned `CHECK_EQ` because that is the existing error-handling strategy. 
I think this calls `TVMPlatformAbort`, so I'm not sure it's the right one for 
inference deployment.
   
   ### Implementation of `TVMLogf` and `TVMPlatformAbort`
   
   I agree at least `TVMLogf` should be optional in deployment. Likely, so 
should `TVMPlatformAbort`, though I think it remains to be seen how we can 
handle errors in higher-level libraries. I don't see operator code as needing 
to call `TVMPlatformAbort`, unless accelerator or other libraries need such a 
facility.
   
   In any case, I agree it's fine to implement these as dummy for your 
production inference STM32 applications--once we resolve the 
`TVMBackendAllocWorkspace` error-handling strategy question above.
   
   cc @tqchen regarding the API problems.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to