Repository: incubator-systemml
Updated Branches:
  refs/heads/master aaca80061 -> 33ebe969b

[MINOR] [DOC] Updated the documentation to clarify the common misconceptions regarding Caffe2DML and Python DSL.

Project: http://git-wip-us.apache.org/repos/asf/incubator-systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-systemml/commit/33ebe969
Tree: http://git-wip-us.apache.org/repos/asf/incubator-systemml/tree/33ebe969
Diff: http://git-wip-us.apache.org/repos/asf/incubator-systemml/diff/33ebe969

Branch: refs/heads/master
Commit: 33ebe969ba1fd3d68941cf0e8c1299daf5df15b4
Parents: aaca800
Author: Niketan Pansare <[email protected]>
Authored: Fri Jun 2 17:19:14 2017 -0800
Committer: Niketan Pansare <[email protected]>
Committed: Fri Jun 2 18:19:14 2017 -0700

----------------------------------------------------------------------
 docs/beginners-guide-caffe2dml.md | 25 +++++++++++++-
 docs/python-reference.md          | 61 ++++++++++++++++++++++++++++++++++
 2 files changed, 85 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-systemml/blob/33ebe969/docs/beginners-guide-caffe2dml.md
----------------------------------------------------------------------
diff --git a/docs/beginners-guide-caffe2dml.md b/docs/beginners-guide-caffe2dml.md
index 429bce2..6d48c61 100644
--- a/docs/beginners-guide-caffe2dml.md
+++ b/docs/beginners-guide-caffe2dml.md
@@ -110,6 +110,29 @@ For more detail on enabling native BLAS, please see the documentation for the [n
 
 ## Frequently asked questions
 
+#### What is the purpose of Caffe2DML API ?
+
+Most deep learning experts are more likely to be familiar with the Caffe's specification
+rather than DML language. For these users, the Caffe2DML API reduces the learning curve to using SystemML.
+Instead of requiring the users to write a DML script for training, fine-tuning and testing the model,
+Caffe2DML takes as an input a network and solver specified in the Caffe specification
+and automatically generates the corresponding DML.
+
+#### With Caffe2DML, does SystemML now require Caffe to be installed ?
+
+Absolutely not. We only support Caffe's API for convenience of the user as stated above.
+Since the Caffe's API is specified in the protobuf format, we are able to generate the java parser files
+and donot require Caffe to be installed. This is also true for Tensorboard feature of Caffe2DML.
+
+```
+Dml.g4 ---> antlr ---> DmlLexer.java, DmlListener.java, DmlParser.java ---> parse foo.dml
+caffe.proto ---> protoc ---> target/generated-sources/caffe/Caffe.java ---> parse caffe_network.proto, caffe_solver.proto
+```
+
+Again, the SystemML engine doesnot invoke (or depend on) Caffe and TensorFlow for any of its runtime operators.
+Since the grammar files for the respective APIs (i.e. `caffe.proto`) are used by SystemML,
+we include their licenses in our jar files.
+
 #### How can I speedup the training with Caffe2DML ?
 
 - Enable native BLAS to improve the performance of CP convolution and matrix multiplication operators.
@@ -259,4 +282,4 @@ train_df.write.parquet('kaggle-cats-dogs.parquet')
 #### Can I use Caffe2DML via Scala ?
 
 Though we recommend using Caffe2DML via its Python interfaces, it is possible to use it by creating an object of the class
-`org.apache.sysml.api.dl.Caffe2DML`. It is important to note that Caffe2DML's scala API is packaged in `systemml-*-extra.jar`.
+`org.apache.sysml.api.dl.Caffe2DML`. It is important to note that Caffe2DML's scala API is packaged in `systemml-*-extra.jar`.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-systemml/blob/33ebe969/docs/python-reference.md
----------------------------------------------------------------------
diff --git a/docs/python-reference.md b/docs/python-reference.md
index a847964..4519bc1 100644
--- a/docs/python-reference.md
+++ b/docs/python-reference.md
@@ -48,6 +48,9 @@ It implements basic matrix operators, matrix functions as well as converters to
 types (for example: Numpy arrays, PySpark DataFrame and Pandas
 DataFrame).
 
+The primary reason for supporting this API is to reduce the learning curve for an average Python user,
+who is more likely to know Numpy library, rather than the DML language.
+
 ### Operators
 
 The operators supported are:
@@ -107,6 +110,64 @@ Since matrix is backed by lazy evaluation and uses a recursive Depth First Searc
 you may run into `RuntimeError: maximum recursion depth exceeded`.
 Please see below [troubleshooting steps](http://apache.github.io/incubator-systemml/python-reference#maximum-recursion-depth-exceeded)
 
+### Dealing with the loops
+
+It is important to note that this API doesnot pushdown loop, which means the
+SystemML engine essentially gets an unrolled DML script.
+This can lead to two issues:
+
+1. Since matrix is backed by lazy evaluation and uses a recursive Depth First Search (DFS),
+you may run into `RuntimeError: maximum recursion depth exceeded`.
+Please see below [troubleshooting steps](http://apache.github.io/incubator-systemml/python-reference#maximum-recursion-depth-exceeded)
+
+2. Significant parsing/compilation overhead of potentially large unrolled DML script.
+
+The unrolling of the for loop can be demonstrated by the below example:
+
+```python
+>>> import systemml as sml
+>>> import numpy as np
+>>> m1 = sml.matrix(np.ones((3,3)) + 2)
+
+Welcome to Apache SystemML!
+
+>>> m2 = sml.matrix(np.ones((3,3)) + 3)
+>>> m3 = m1
+>>> for i in range(5):
+...     m3 = m1 * m3 + m1
+...
+>>> m3
+# This matrix (mVar12) is backed by below given PyDML script (which is not yet evaluated). To fetch the data of this matrix, invoke toNumPy() or toDF() or toPandas() methods.
+mVar1 = load(" ", format="csv")
+mVar3 = mVar1 * mVar1
+mVar4 = mVar3 + mVar1
+mVar5 = mVar1 * mVar4
+mVar6 = mVar5 + mVar1
+mVar7 = mVar1 * mVar6
+mVar8 = mVar7 + mVar1
+mVar9 = mVar1 * mVar8
+mVar10 = mVar9 + mVar1
+mVar11 = mVar1 * mVar10
+mVar12 = mVar11 + mVar1
+save(mVar12, " ")
+```
+
+We can reduce the impact of this unrolling by eagerly evaluating the variables inside the loop:
+
+```python
+>>> import systemml as sml
+>>> import numpy as np
+>>> m1 = sml.matrix(np.ones((3,3)) + 2)
+
+Welcome to Apache SystemML!
+
+>>> m2 = sml.matrix(np.ones((3,3)) + 3)
+>>> m3 = m1
+>>> for i in range(5):
+...     m3 = m1 * m3 + m1
+...     sml.eval(m3)
+
+```
 ### Built-in functions
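
Note on the python-reference.md change: the unrolling it describes can be reproduced without SystemML installed. The sketch below is a toy model, not the systemml implementation. `LazyMatrix`, `script()`, `evaluate()` and `run_loop` are hypothetical names introduced only for illustration. It assumes that each operator merely records a pending statement, that the script is collected with a recursive depth-first walk, and that eager evaluation turns a node back into a leaf with no pending work.

```python
import itertools

_ids = itertools.count(1)  # hypothetical variable numbering, for illustration only


class LazyMatrix:
    """Toy lazily evaluated matrix: operators record work instead of doing it."""

    def __init__(self, op=None, inputs=()):
        self.name = "mVar%d" % next(_ids)
        self.op = op          # None marks a leaf whose data is already available
        self.inputs = inputs  # operand nodes of the pending operation

    def __mul__(self, other):
        return LazyMatrix("*", (self, other))

    def __add__(self, other):
        return LazyMatrix("+", (self, other))

    def script(self):
        """Collect one statement per pending operation with a recursive DFS."""
        lines, seen = [], set()

        def visit(node):
            if node.name in seen:
                return
            seen.add(node.name)
            for child in node.inputs:
                visit(child)
            if node.op is not None:
                left, right = node.inputs
                lines.append("%s = %s %s %s" % (node.name, left.name, node.op, right.name))

        visit(self)
        return lines

    def evaluate(self):
        """Pretend to run the pending script; the node then behaves like a leaf."""
        self.op, self.inputs = None, ()


def run_loop(iterations, eager):
    """Return (longest pending script seen, pending script length at the end)."""
    m1 = LazyMatrix()
    m3 = m1
    longest = 0
    for _ in range(iterations):
        m3 = m1 * m3 + m1
        longest = max(longest, len(m3.script()))
        if eager:
            m3.evaluate()
    return longest, len(m3.script())


lazy_longest, lazy_final = run_loop(5, eager=False)
eager_longest, eager_final = run_loop(5, eager=True)
print("lazy: ", lazy_longest, lazy_final)
print("eager:", eager_longest, eager_final)
```

In this model the lazy loop accumulates two statements per iteration, so five iterations leave ten pending statements, the same count as the `mVar3` to `mVar12` assignments in the diff. With eager evaluation the pending script never exceeds two statements, which is the effect the `sml.eval(m3)` example is aiming for. The recursive `visit` also shows why a long unrolled chain can hit Python's recursion limit.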
