This is an automated email from the ASF dual-hosted git repository.

jxie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
     new 7d08810  Exception handling documentation (#9869)
7d08810 is described below

commit 7d0881036959f383bab7efa22a80a9caa419274c
Author: Anirudh Subramanian <anirudh2...@gmail.com>
AuthorDate: Mon Mar 5 15:52:21 2018 -0800

    Exception handling documentation (#9869)
    
    * Add tests for Exception Handling in Iterators
    
    * Fixing test_random
    
    * Add documentation for exc handling
    
    * Fix for exc handling doc
    
    * Fix exc handling doc
    
    * Add exception handling documentation
    
    * Correct the seed change
    
    * Fix
    
    * Improve exception handling docs
    
    * Add dmlc-core
    
    * Empty commit
    
    * Add dmlc-core
    
    * Move to architecture design docs
    
    * Add exception handling to index
    
    * Trigger CI
---
 dmlc-core                               |   2 +-
 docs/architecture/exception_handling.md | 111 ++++++++++++++++++++++++++++++++
 docs/architecture/index.md              |   1 +
 tests/python/unittest/test_io.py        |  24 +++++++
 4 files changed, 137 insertions(+), 1 deletion(-)

diff --git a/dmlc-core b/dmlc-core
index a1fd683..282b986 160000
--- a/dmlc-core
+++ b/dmlc-core
@@ -1 +1 @@
-Subproject commit a1fd6834c0cd3fd2cc586deec2dc24194924cada
+Subproject commit 282b98663f59df6b26f906580af610dea3046f22
diff --git a/docs/architecture/exception_handling.md b/docs/architecture/exception_handling.md
new file mode 100644
index 0000000..5b4448a
--- /dev/null
+++ b/docs/architecture/exception_handling.md
@@ -0,0 +1,111 @@
+# Exception Handling in MXNet
+
+This tutorial explains the exception handling support in MXNet,
+and provides examples of how to throw and handle exceptions in a multithreaded context.
+Although the examples are in Python, they can easily be extended to other MXNet
+language bindings.
+
+MXNet exceptions can be thrown from two areas:
+- The MXNet main thread, for example during InferShape and InferType.
+- Spawned threads:
+    * By the dependency engine, for operator execution in parallel
+    * By the iterators, during data loading, text parsing, etc.
+
+In the first case, the exception is thrown and can be handled in the main thread.
+In the second case, the exception is thrown in a spawned thread, caught and transported to the
+main thread, where it is rethrown. This tutorial will give more explanation and examples on how
+to handle exceptions for the second case.
+
+## Prerequisites 
+
+To complete this tutorial, we need:
+- MXNet [7b24137](https://github.com/apache/incubator-mxnet/commit/7b24137ed45df605defa4ce72ec91554f6e445f0). See the instructions in [Setup and Installation](http://mxnet.io/install/index.html).
+
+## Exception Handling for Iterators
+
+The example below shows how to handle exceptions for iterators. In this example,
+we populate files for data and labels with fewer labels than samples.
+This should throw an exception.
+
+CSVIter uses PrefetcherIter for loading and parsing data.
+The PrefetcherIter spawns a producer thread in the background which prefetches
+the data while the main thread consumes the data. The exception is thrown in the spawned
+producer thread during prefetching, when no label is found for a specific sample.
+
+The exception is transported to the main thread, where it is rethrown when `Next` is
+called as part of the following line: `for batch in iter(data_train)`.
+
+In general, an exception may be rethrown as part of `Next` and `BeforeFirst` calls, which correspond to the `next()` and `reset()` methods of `MXDataIter` in the Python language bindings.
+
+```python
+import os
+import mxnet as mx
+
+cwd = os.getcwd()
+data_path = os.path.join(cwd, "data.csv")
+label_path = os.path.join(cwd, "label.csv")
+
+with open(data_path, "w") as fout:
+    for i in range(8):
+        fout.write("1,2,3,4,5,6,7,8,9,10\n")
+
+with open(label_path, "w") as fout:
+    for i in range(7):
+        fout.write("label"+str(i))
+
+try:
+    data_train = mx.io.CSVIter(data_csv=data_path, label_csv=label_path, data_shape=(1, 10),
+                               batch_size=4)
+
+    for batch in iter(data_train):
+        print(data_train.getdata().asnumpy())
+except mx.base.MXNetError as ex:
+    print("Exception handled")
+    print(ex)
+```
+
+### Limitation
+
+There is a race condition when your last `next()` call does not reach the batch in your dataset where the exception occurs. The exception may or may not be thrown in this case, depending on which thread wins the race. To avoid this situation, iterate through your full dataset if you think it can throw exceptions that need to be handled.
+
+
+## Exception Handling for Operators
+
+The example below shows how to handle exceptions for operators in imperative mode.
+
+For the operator case, the dependency engine spawns a number of threads if it is running in the `ThreadedEnginePool` or `ThreadedEnginePerDevice` mode. The final operator is executed in one of the spawned threads.
+
+If an operator throws an exception during execution, this exception is propagated
+down the dependency chain. Once there is a synchronizing call, i.e. `WaitToRead`, for a variable in the dependency chain, the propagated exception is rethrown.
+
+The example below illustrates how an exception that occurs in the operator writing to `d` is propagated down the dependency chain and finally rethrown when we make a synchronizing call to `WaitToRead`.
+
+```python
+import mxnet as mx
+a = mx.nd.random.normal(0, 1, (2, 2))
+b = mx.nd.random.normal(0, 2, (2, 2))
+c = mx.nd.dot(a, b)
+d = mx.nd.random.normal(0, -1, (2, 2))
+e = mx.nd.dot(c, d)
+e.wait_to_read()
+```
+
+Although the above exception occurs when executing the operation which writes to the variable `d` in one of the child threads, it is thrown only when the synchronization happens as part of the line: `e.wait_to_read()`.
+
+Let us take another example. In the following case, we write to two variables and then `wait_to_read` for both. This example shows that any particular exception will not be thrown more than once.
+
+```python
+import mxnet as mx
+a = mx.nd.random.normal(0, 1, (2, 2))
+b = mx.nd.random.normal(0, -1, (2, 2))
+c, d = mx.nd.dot(a, b)
+try:
+    c.asnumpy()
+except mx.base.MXNetError as ex:
+    print("Exception handled")    
+d.asnumpy()
+```
+
+### Limitation
+
+Rethrowing exceptions as part of `mx.nd.waitall` is not supported. If your code executes a few operators and then calls `waitall` instead of `wait_to_read`/`asnumpy`, the exception will be silently lost. Avoid `waitall` in your code unless you are confident that it cannot throw an exception in any scenario.
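
The operator section of the document added above describes three behaviours: a failing operator does not raise immediately, its exception travels down the dependency chain, and it is rethrown once at a synchronization point. The following is a minimal single-threaded sketch of that idea in plain Python. It is not MXNet's engine code, and the names `DeferredError`, `Var`, `run_op` and `checked_sqrt` are hypothetical, introduced only for illustration.

```python
class DeferredError:
    """Holds an exception raised by an operation until a caller synchronizes."""

    def __init__(self, exc):
        self.exc = exc
        self.rethrown = False


class Var:
    """Result of an operation: either a value or a deferred error."""

    def __init__(self, value=None, error=None):
        self.value = value
        self.error = error

    def wait_to_read(self):
        # Synchronization point: surface a pending error, but only the first time.
        if self.error is not None and not self.error.rethrown:
            self.error.rethrown = True
            raise self.error.exc
        return self.value


def run_op(fn, *inputs):
    """Run fn on the inputs, or skip it and pass on an input's deferred error."""
    for var in inputs:
        if var.error is not None:
            return Var(error=var.error)
    try:
        return Var(value=fn(*(var.value for var in inputs)))
    except Exception as exc:
        return Var(error=DeferredError(exc))


def checked_sqrt(x):
    if x < 0:
        raise ValueError("negative input")
    return x ** 0.5


a = Var(value=4.0)
b = run_op(checked_sqrt, a)                  # succeeds
c = run_op(checked_sqrt, Var(value=-1.0))    # fails, but nothing is raised yet
d = run_op(lambda x, y: x + y, b, c)         # skipped: inherits c's error
e = run_op(lambda x: x * 2, d)               # skipped as well

try:
    e.wait_to_read()
except ValueError as ex:
    print("Exception handled:", ex)

# The shared error was already rethrown once, so this call does not raise again.
print(d.wait_to_read())
```

In this toy model a variable whose computation failed simply reads as `None` after the error has been handled. That is an arbitrary choice of the sketch and says nothing about what MXNet returns in that situation.
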
diff --git a/docs/architecture/index.md b/docs/architecture/index.md
index 7a8ec3d..91fb5f5 100644
--- a/docs/architecture/index.md
+++ b/docs/architecture/index.md
@@ -20,3 +20,4 @@ Additionally, we provide an overview of the complete MXNet system.
 * [Dependency Engine for Deep Learning](http://mxnet.io/architecture/note_engine.html)
 * [Optimizing the Memory Consumption in Deep Learning](http://mxnet.io/architecture/note_memory.html)
 * [Efficient Data Loading Module for Deep Learning](http://mxnet.io/architecture/note_data_loading.html)
+* [Exception Handling in MXNet](http://mxnet.io/architecture/exception_handling.html)
diff --git a/tests/python/unittest/test_io.py b/tests/python/unittest/test_io.py
index 58ca1d7..4e23a22 100644
--- a/tests/python/unittest/test_io.py
+++ b/tests/python/unittest/test_io.py
@@ -18,6 +18,7 @@
 # pylint: skip-file
 import mxnet as mx
 from mxnet.test_utils import *
+from mxnet.base import MXNetError
 import numpy as np
 import os, gzip
 import pickle as pickle
@@ -249,8 +250,31 @@ def test_LibSVMIter():
             assert(num_batches == int(expected_num_batches)), num_batches
             data_train.reset()
 
+    def check_libSVMIter_exception():
+        cwd = os.getcwd()
+        data_path = os.path.join(cwd, 'data.t')
+        label_path = os.path.join(cwd, 'label.t')
+        with open(data_path, 'w') as fout:
+            fout.write('1.0 0:0.5 2:1.2\n')
+            fout.write('-2.0\n')
+            # The line below has a negative index and should throw an exception
+            fout.write('-3.0 -1:0.6 1:2.4 2:1.2\n')
+            fout.write('4 2:-1.2\n')
+
+        with open(label_path, 'w') as fout:
+            fout.write('1.0\n')
+            fout.write('-2.0 0:0.125\n')
+            fout.write('-3.0 2:1.2\n')
+            fout.write('4 1:1.0 2:-1.2\n')
+        data_dir = os.path.join(cwd, 'data')
+        data_train = mx.io.LibSVMIter(data_libsvm=data_path, label_libsvm=label_path,
+                                      data_shape=(3, ), label_shape=(3, ), batch_size=3)
+        for batch in iter(data_train):
+            data_train.get_data().asnumpy()
+
     check_libSVMIter_synthetic()
     check_libSVMIter_news_data()
+    assertRaises(MXNetError, check_libSVMIter_exception)
 
 
 def test_DataBatch():

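The test added above relies on an exception raised in the iterator's background thread surfacing in the thread that iterates. As a rough illustration of that pattern (catch in the producer thread, transport, rethrow on `next()`), here is a minimal sketch using only the Python standard library. It is not how `PrefetcherIter` is implemented; `PrefetchingIter`, `batches_with_missing_label` and `consume` are hypothetical names used only for this example.

```python
import queue
import threading


class PrefetchingIter:
    """Toy prefetcher: a producer thread fills a queue, the caller consumes it."""

    _DONE = object()  # sentinel marking the normal end of the data

    def __init__(self, source):
        self._queue = queue.Queue(maxsize=2)
        self._thread = threading.Thread(
            target=self._produce, args=(source,), daemon=True)
        self._thread.start()

    def _produce(self, source):
        try:
            for item in source:
                self._queue.put(item)
            self._queue.put(self._DONE)
        except Exception as exc:
            # Do not let the exception die with the thread: hand it to the consumer.
            self._queue.put(exc)

    def __iter__(self):
        return self

    def __next__(self):
        item = self._queue.get()
        if item is self._DONE:
            raise StopIteration
        if isinstance(item, Exception):
            raise item  # rethrown in the thread that called next()
        return item


def batches_with_missing_label():
    """Hypothetical data source that fails partway through."""
    yield "batch 0"
    yield "batch 1"
    raise ValueError("label missing for sample 7")


def consume(source):
    """Iterate in the calling thread and report what was seen and any error."""
    seen = []
    try:
        for batch in PrefetchingIter(source):
            seen.append(batch)
    except ValueError as ex:
        return seen, str(ex)
    return seen, None


seen, message = consume(batches_with_missing_label())
print(seen, message)
```

Because the queue preserves order, the consumer in this sketch always sees the batches produced before the failure and only then the exception. The race described in the Limitation section of the new document is not modelled here.
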
-- 
To stop receiving notification emails like this one, please contact
j...@apache.org.
