Very good document, thanks!

One issue with approach 1 is that resuming the operators after the failed one
may cause errors or even a system hang. Say op A writes var V and op B reads
V. If A fails, B will never be executed unless we clear their dependency,
but clearing it leads to wrong results as well.
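
To make the A/B case concrete, here is a toy Python sketch. It is not MXNet's engine; `simulate`, `pending_writes` and `clear_deps_on_failure` are hypothetical names made up for illustration, and the "engine" is just a set of variables with unfinished writes.

```python
def simulate(clear_deps_on_failure):
    """Toy model: op A writes V, op B reads V, and A fails."""
    store = {"V": None}        # V has no valid value until A writes it
    pending_writes = {"V"}     # B must wait while V has a pending write
    log = []

    def op_a():
        # A fails before it can write V.
        raise RuntimeError("op A failed")

    def op_b():
        # B simply reads whatever is currently stored in V.
        return store["V"]

    try:
        op_a()
        store["V"] = 42
        pending_writes.discard("V")
    except RuntimeError:
        log.append("A failed")
        if clear_deps_on_failure:
            # Force-release B's dependency on V even though V was never written.
            pending_writes.discard("V")

    if "V" in pending_writes:
        # A real engine would wait here indefinitely, i.e. hang.
        log.append("B blocked")
    else:
        log.append(f"B read {op_b()!r}")
    return log


print(simulate(clear_deps_on_failure=False))
print(simulate(clear_deps_on_failure=True))
```

Without clearing, B stays blocked on V. With clearing, B runs but reads a value that A never produced.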

Best
Mu

> On Jan 19, 2018, at 10:07 AM, Anirudh <[email protected]> wrote:
> 
> Hi,
> 
> I have outlined the approach and proof of concept for Better Exception
> Handling in MXNet. Please provide feedback/comments/suggestions in the
> comments section of the wiki.
> 
> https://cwiki.apache.org/confluence/display/MXNET/Improved+exception+handling+in+MXNet
> 
> 
> Note: Responses will be delayed till 01/22/2018.
> 
> Anirudh
