This is an automated email from the ASF dual-hosted git repository.
lausen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git
The following commit(s) were added to refs/heads/master by this push:
new f7c4323 Dynamic subgraph compile support (#17623)
f7c4323 is described below
commit f7c43234d08d4b3a9401f2d5ffc1a98795765ad5
Author: Sam Skalicky <[email protected]>
AuthorDate: Thu Mar 19 12:38:59 2020 -0700
Dynamic subgraph compile support (#17623)
This PR adds support for passing the NDArrays from the existing
optimize_for API down to the reviewSubgraph function in an external library. It
also adds a new API for HybridBlock called optimize_for that can partition the
model without running a forward pass.
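A minimal sketch of the new usage (the backend name 'myPart' and the variables
below are placeholders, following the subgraph library example):

    # Gluon: partition in place without running a forward pass, then export
    block.optimize_for(x, backend='myPart')
    block.export('partitioned')

    # Symbol: optionally pass args/aux NDArrays so shapes/types can be inferred
    sym2 = sym.optimize_for('myPart', args, aux)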
Feature changes
Adds a new API to HybridBlock, optimize_for, that partitions the model but
does not call the CachedOp
Modifies the subgraph library example to optionally require args to be
provided
Adds annotations on subgraph inputs with the name of the original param so that
inputs can be mapped, and passes annotations to the input nodes of subgraphs
Adds support for tensors in MKLDNN format, calls Reorder2Default
New tests
Adds a new test to partition operators that directly consume params
Adds a new model to test where ops to be partitioned have args/params
Bug Fixes
fixes bug in passing ids vector by value instead of by reference
fixes bug in passing copies of attributes instead of by reference
fixes bug where _cached_graph was not updated after partitioning
fixes memory leak where user-specified attributes on subgraph ops were
not freed if subgraph was rejected
fixes problem of incorrectly indexing into shape/dtype maps when
annotating the graph
Docs
Updates the README doc with the latest changes described above
---
example/extensions/lib_subgraph/README.md | 69 ++++--
example/extensions/lib_subgraph/subgraph_lib.cc | 41 +++-
example/extensions/lib_subgraph/test_subgraph.py | 62 +++++-
include/mxnet/c_api.h | 4 +-
include/mxnet/lib_api.h | 107 +++++++--
perl-package/AI-MXNetCAPI/mxnet.i | 2 +
python/mxnet/gluon/block.py | 71 +++++-
python/mxnet/symbol/symbol.py | 23 +-
src/c_api/c_api.cc | 57 +++--
src/c_api/c_api_symbolic.cc | 70 ++++--
src/operator/subgraph/build_subgraph.cc | 9 +-
.../partitioner/custom_subgraph_property.h | 247 ++++++++++++++++++---
src/operator/subgraph/subgraph_property.h | 8 +
tests/python/unittest/test_extensions.py | 14 ++
tests/python/unittest/test_subgraph_op.py | 4 +-
15 files changed, 657 insertions(+), 131 deletions(-)
diff --git a/example/extensions/lib_subgraph/README.md
b/example/extensions/lib_subgraph/README.md
index b113be2..83c8236 100644
--- a/example/extensions/lib_subgraph/README.md
+++ b/example/extensions/lib_subgraph/README.md
@@ -53,9 +53,11 @@ You can start getting familiar with custom partitioners by
running an example pr
* **lib_subgraph/test_subgraph.py**: This file calls
`mx.library.load(‘libsubgraph_lib.so’)` to load the library containing the
custom components, partitions the model using the `optimize_for` API, and
prints outputs of the forward passes. The outputs should be the same as the
regular MXNet forward pass without partitioning.
+* **include/mxnet/lib_api.h**: This file from the MXNet source code is the single
header file needed to include all necessary data types and function prototypes
for writing a custom operator library. You can either specify the include path
in the `Makefile`, or copy the header file over to the
`example/extensions/lib_subgraph` folder. Note that apart from this header, the
custom operator library is independent of MXNet source.
+
## Writing Custom Partitioner Library
-For building a library containing your own custom partitioner, compose a C++
source file like `mypart_lib.cc`, include `lib_api.h` header file, and write
your custom partitioner with these essential functions:
+To build your own library containing a custom partitioner, compose a C++
source file like `mypart_lib.cc`, include the `lib_api.h` header file, and write
your custom partitioner with these essential functions:
- `initialize` - Library Initialization Function
- `REGISTER_PARTITIONER ` - Partitioner Registration Macro
- `mySupportedOps ` - Operator Support
@@ -76,34 +78,60 @@ sym, _, _ = mx.model.load_checkpoint('mymodel', 0)
# Symbol/Module flow
sym2 = sym.optimize_for("myPart")
-# Gluon flow
+# Gluon flow 1
sym_block = nn.SymbolBlock(sym, inputs)
sym_block.hybridize(backend='myPart')
+
+# Gluon flow 2
+sym_block = nn.SymbolBlock(sym, inputs)
+sym_block.optimize_for(x, backend='myPart')
```
+In the Gluon hybridize flow, the model is actually hybridized during the first
inference, rather than immediately when calling `hybridize`. This
hybridize-based flow is useful if a user expects to run inference immediately
after hybridizing. But for users that just want to partition but not run a
whole forward pass, the `optimize_for` API combines the hybridize/forward APIs
but does not run a forward pass. After calling `optimize_for` users can
`export` their model immediately without run [...]
+
### Using a Custom Partitioner Library
Partitioning APIs in MXNet are available in both Symbol and Gluon APIs. For
the Symbol API, the `optimize_for` API can be called on Symbol objects to
return a partitioned Symbol.
```
-optimize_for(backend, args=None, ctx=None, **kwargs)
+optimize_for(backend, args=None, aux=None, ctx=None, **kwargs)
```
-The `optimize_for` API takes at least 1 argument, `backend` which is a string
that identifies which backend to partition the model for. The `args` argument
is optional and takes a list of NDArray or dict of str to NDArray. It is used
to infer shapes and types and before partitioning. The `ctx` argument is
optional and takes a device context to infer storage types. It also take any
other user-specified options that will be passed to the backend partitioning
APIs.
+The `optimize_for` API takes at least one argument, `backend`, which is a string
that identifies which backend to partition the model for. The `args` and `aux`
arguments are optional and take a list of NDArray or dict of str to NDArray.
They are used to infer shapes and types before partitioning, and are passed to
the backend to use during compilation. The `ctx` argument is optional and takes
a device context to infer storage types. It also takes any other user-specified
options that will b [...]
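For example, a minimal sketch (the backend name, option, and shapes below are
placeholders; `args`/`aux` would normally hold a trained model's parameters):

```
args = {'a': mx.nd.ones((3,2)), 'b': mx.nd.ones((3,2))}
sym2 = sym.optimize_for("myPart", args, ctx=mx.cpu(), myOption='myVal')
```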
For the Gluon API, the `hybridize` API can be called on HybridBlocks to
partition the internal CachedOp Symbol.
```
-hybridize(backend=None, backend_opts=None)
+hybridize(backend=None, backend_opts=None, **kwargs)
+```
+
+The `hybridize` function prepares the HybridBlock to be converted into a
backend symbol. The `backend` argument is a string that identifies which
backend will partition the model. The `backend_opts` argument takes other
user-specified options that will be passed to the backend partitioning APIs.
The actual partitioning takes place during the forward pass.
+
+If you just want to partition the HybridBlock but not run a complete forward
pass, you can use the `optimize_for` API that combines the work done in the
`hybridize` API with part of the work done in the forward pass.
+
+```
+optimize_for(x, backend=None, backend_opts=None, **kwargs)
+```
+
+When the `optimize_for` API is called on a HybridBlock it partitions
immediately. This lets users export the partitioned model without running a
complete forward pass.
+
+```
+block.optimize_for(x, backend='myPart')
+block.export('partitioned')
```
-When the `hybridize` function is called, Gluon will convert the program’s
execution into the style used in symbolic programming. The `backend` argument
is a string that identifies which backend to partition the model for. The
`backend_opts` takes other user-specified options that will be passed to the
backend partitioning APIs.
+But you can also use `optimize_for` in place of `hybridize` and run inference
immediately afterwards.
+
+```
+block.optimize_for(x, backend='myPart')
+block(x)
+```
### Writing A Custom Partitioner
There are several essential building blocks for making a custom partitioner:
-* [initialize](./subgraph_lib.cc#L242):
+* [initialize](./subgraph_lib.cc#L261):
* This function is the library initialization function necessary for any
dynamic libraries. It lets you check if the user is using a compatible version
of MXNet. Note that this `version` parameter is passed from MXNet when library
is loaded.
MXReturnValue initialize(int version)
@@ -116,40 +144,37 @@ There are several essential building blocks for making a
custom partitioner:
std::vector<bool>& ids,
std::unordered_map<std::string, std::string>& options)
-* [REGISTER_PARTITIONER(my_part_name)](./subgraph_lib.cc#L238):
+* [REGISTER_PARTITIONER(my_part_name)](./subgraph_lib.cc#L257):
* This macro registers the custom partitioner and its properties to MXNet
by its name. Notice that a partitioner can have multiple partitioning
strategies. This enables multiple *passes* to be run in a single partitioning
call from the user. The first argument to `addStrategy` is a user-specified
name. The second argument is the `supportedOps` function. The third argument is
the name of the subgraph operator to create for each subgraph created during
partitioning (see below for more [...]
REGISTER_PARTITIONER(my_part_name)
- .addStrategy("strategy1",
- supportedOps,
- "_custom_subgraph_op")
- .setReviewSubgraph("strategy1",
- reviewSubgraph);
+ .addStrategy("strategy1", supportedOps, "_custom_subgraph_op")
+ .setReviewSubgraph("strategy1", reviewSubgraph);
Also there are some optional functions you can specify:
-* [reviewSubgraph](./subgraph_lib.cc#L220):
+* [reviewSubgraph](./subgraph_lib.cc#L219):
* This function provides an opportunity to accept/reject a subgraph after
MXNet partitions it. It also allows specifying custom attributes on the
subgraph (ie. user-generated IDs). If you do not register this function,
subgraphs will be accepted by default.
MXReturnValue reviewSubgraph(
std::string json,
- int subraph_id,
+ int subgraph_id,
bool* accept,
- std::unordered_map<std::string,
- std::string>& options,
- std::unordered_map<std::string,
- std::string>& attrs)
+ std::unordered_map<std::string, std::string>& options,
+ std::unordered_map<std::string, std::string>& attrs,
+ std::map<std::string, MXTensor>& args,
+ std::map<std::string, MXTensor>& aux)
Let’s take a closer look at those registry functions:
-* **supportedOps**: This function takes four arguments. The 1st argument is a
JSON string of the model architecture graph, where nodes are
inputs/params/weights and edges are data dependencies. The graph is pre-sorted
in topological order. The 2nd argument is an array of booleans, one for each
operator in the model. When traversing the graph, operators to be partitioned
into subgraphs are identified and an entry is set to `true` for the node ID in
the `ids` array. The last argument is th [...]
+* **supportedOps**: This function takes four arguments. The 1st argument is a
JSON string of the model architecture graph, where nodes are
inputs/params/weights and edges are data dependencies. The graph is pre-sorted
in topological order. The 2nd argument is an array of booleans, one for each
operator in the model. When traversing the graph, operators to be partitioned
into subgraphs are identified and an entry is set to `true` for the index in
the `ids` array corresponding to the node [...]
-* **reviewSubgraph**: This function takes five arguments. The 1st argument is
a JSON string of the newly partitioned subgraph. The 2nd argument is the
subgraph ID, this is just a number MXNet uses to identify this particular
subgraph (it starts at zero and increments). The 3rd argument is an output to
be set in this function to tell MXNet whether to accept (value: `true`) or
reject (value: `false`) the subgraph. The 4th argument is the map of options
specified by the user. The last argum [...]
+* **reviewSubgraph**: This function takes five arguments. The 1st argument is
a JSON string of the newly partitioned subgraph. The 2nd argument is the
subgraph ID, this is just a number MXNet uses to identify this particular
subgraph (it starts at zero and increments, unique for each subgraph in the
model). The 3rd argument is an output to be set in this function to tell MXNet
whether to accept (value: `true`) or reject (value: `false`) the subgraph. You
might want to reject a subgraph i [...]
### Writing A Custom Subgraph Operator
-A partitioning strategy specifies how to partition a model and isolate
operators into subgraphs. In MXNet, subgraphs are just a [stateful
operator](../lib_custom_op#writing-stateful-custom-operator). Subgraph
operators have an extra attribute called `SUBGRAPH_SYM_JSON` that maps to a
JSON string of the subgraph. The expectation is that when a subgraph operator
executes a forward/backward call, it executes all of the operators in the
subgraph.
+A partitioning strategy specifies how to partition a model and isolate
operators into subgraphs. In MXNet, subgraphs are just a [stateful
operator](../lib_custom_op#writing-stateful-custom-operator). Subgraph
operators have an extra attribute called `MX_STR_SUBGRAPH_SYM_JSON` that maps
to a JSON string of the subgraph. The expectation is that when a subgraph
operator executes a forward/backward call, it executes all of the operators in
the subgraph.
When registering a custom subgraph operator, all thats needed is to register a
`createOpState` function and to set that the operator is a subgraph operator by
calling the `setIsSubgraphOp` API like:
diff --git a/example/extensions/lib_subgraph/subgraph_lib.cc
b/example/extensions/lib_subgraph/subgraph_lib.cc
index da888fd..8c24dd8 100644
--- a/example/extensions/lib_subgraph/subgraph_lib.cc
+++ b/example/extensions/lib_subgraph/subgraph_lib.cc
@@ -160,11 +160,11 @@ MXReturnValue createOpState(std::map<std::string,
std::string> attrs,
std::string serialized_subgraph = "[empty]";
// MXNet subgraph is stored as Symbol in operator node attrs subgraphs field
// custom subgraph is stored as json string in custom operator attrs map
entry
- if (attrs.count(SUBGRAPH_SYM_JSON)) {
+ if (attrs.count(MX_STR_SUBGRAPH_SYM_JSON)) {
// user can now parse json and run other custom ops inside subgraph
- serialized_subgraph = attrs[SUBGRAPH_SYM_JSON];
+ serialized_subgraph = attrs[MX_STR_SUBGRAPH_SYM_JSON];
}
- attrs.erase(SUBGRAPH_SYM_JSON);
+ attrs.erase(MX_STR_SUBGRAPH_SYM_JSON);
*op_inst = new MyStatefulOp(serialized_subgraph, attrs);
std::cout << "Info: stateful operator created" << std::endl;
return MX_SUCCESS;
@@ -177,7 +177,7 @@ REGISTER_OP(_custom_subgraph_op)
const std::vector<std::string> op_names({"exp","log"});
MXReturnValue mySupportedOps(std::string json,
- std::vector<bool> ids,
+ std::vector<bool>& ids,
std::unordered_map<std::string, std::string>&
options) {
for (auto kv : options) {
std::cout << "option: " << kv.first << " ==> " << kv.second << std::endl;
@@ -204,8 +204,8 @@ MXReturnValue mySupportedOps(std::string json,
dtype = std::stoi(attrs.map[JsonVal("dtype")].str);
}
- //check if op dtype is float
- if(dtype == kFloat32) {
+ //check if op dtype is float, and if option was specified to require float
types
+ if((dtype == kFloat32 && options.count("reqFloat") > 0) ||
options.count("reqFloat") == 0) {
//check if op is in whitelist
if(std::find(op_names.begin(),op_names.end(),op.str.c_str()) !=
op_names.end()) {
// found op in whitelist, set value to 1 to include op in subgraph
@@ -216,14 +216,34 @@ MXReturnValue mySupportedOps(std::string json,
return MX_SUCCESS;
}
-MXReturnValue myReviewSubgraph(std::string json, int subraph_id, bool* accept,
+MXReturnValue myReviewSubgraph(std::string json, int subgraph_id, bool* accept,
std::unordered_map<std::string, std::string>&
options,
- std::unordered_map<std::string, std::string>&
attrs) {
+ std::unordered_map<std::string, std::string>&
attrs,
+ std::map<std::string, MXTensor>& args,
+ std::map<std::string, MXTensor>& aux) {
for (auto kv : options) {
std::cout << "option: " << kv.first << " ==> " << kv.second << std::endl;
}
- if(options.find("reject") != options.end() &&
- options["reject"].compare("True") == 0) {
+ for (auto kv : args) {
+ std::cout << "arg: " << kv.first << " ==> (";
+ for (auto s : kv.second.shape)
+ std::cout << s << ",";
+ std::cout << ") [";
+ for (int i=0; i<kv.second.size(); i++)
+ std::cout << kv.second.data<float>()[i] << ", ";
+ std::cout << "]" << std::endl;
+ }
+
+ // check if option `reqArgs` was specified, and if so check if args were
provided
+ if(options.count("reqArgs") > 0 && args.size() == 0) {
+ *accept = false;
+ std::cout << "rejecting subgraph since args were not provided" <<
std::endl;
+ return MX_SUCCESS;
+ }
+
+ // check if option `reject` was specified, and if so check if value is 'True'
+ if(options.count("reject") > 0 && options["reject"].compare("True") == 0) {
+ // if specified, reject the subgraph. this is only used for testing
*accept = false;
std::cout << "rejecting subgraph" << std::endl;
} else {
@@ -231,7 +251,6 @@ MXReturnValue myReviewSubgraph(std::string json, int
subraph_id, bool* accept,
std::cout << "accepting subgraph" << std::endl;
attrs["myKey"] = "myVal";
}
- std::cout << json << std::endl;
return MX_SUCCESS;
}
diff --git a/example/extensions/lib_subgraph/test_subgraph.py
b/example/extensions/lib_subgraph/test_subgraph.py
index 1bcecae..55a4051 100644
--- a/example/extensions/lib_subgraph/test_subgraph.py
+++ b/example/extensions/lib_subgraph/test_subgraph.py
@@ -23,8 +23,10 @@
# This test checks if dynamic loading of library into MXNet is successful
# and checks the end of end computation of custom operator
-import mxnet as mx
import os, ctypes
+import mxnet as mx
+from mxnet.gluon import nn
+from mxnet import nd
from mxnet.base import _LIB, check_call, mx_uint, c_str, c_str_array,
SymbolHandle
# load library
@@ -35,6 +37,10 @@ elif (os.name=='nt'):
path = os.path.abspath('libsubgraph_lib.dll')
mx.library.load(path)
+###############################################
+# Test with subgraph not consuming params
+###############################################
+# example model, ops to be partitioned do not have args (use outputs from
other ops as inputs)
a = mx.sym.var('a')
b = mx.sym.var('b')
c = a + b
@@ -75,9 +81,6 @@ exe3 = mysym3.bind(ctx=mx.cpu(), args={'a':mx.nd.ones((3,2)),
'b':mx.nd.ones((3,
out3 = exe3.forward()
print(out3)
-from mxnet.gluon import nn
-from mxnet import nd
-
# Gluon Hybridize partitioning with shapes/types
print('-------------------------------')
print('Testing Gluon Hybridize partitioning with shapes/types')
@@ -88,3 +91,54 @@ sym_block.hybridize(backend='myProp')
out4 = sym_block(mx.nd.ones((3,2)),mx.nd.ones((3,2)))
print(out4)
+# Gluon Hybridize partitioning with shapes/types without inference
+print('-------------------------------')
+print('Testing Gluon Hybridize partitioning with shapes/types without
inference')
+inputs = [a,b]
+sym_block2 = nn.SymbolBlock(sym, inputs)
+sym_block2.initialize()
+sym_block2.optimize_for(mx.nd.ones((3,2)), mx.nd.ones((3,2)), backend='myProp')
+sym_block2.export('partitioned')
+
+
+###############################################
+# Test with subgraph directly consuming params
+###############################################
+# example model, ops to be partitioned have args
+d2 = mx.sym.exp(a)
+sym2 = mx.sym.log(d2)
+
+#execute in MXNet
+print('-------------------------------')
+print('Testing regular MXNet execution')
+exe5 = sym2.bind(ctx=mx.cpu(), args={'a':mx.nd.ones((3,2))})
+out5 = exe5.forward()
+print(out5)
+
+# with propagating shapes/types
+print('-------------------------------')
+print('Testing partitioning with shapes/types')
+arg_array = [mx.nd.ones((3,2),dtype='float32')]
+mysym6 = sym2.optimize_for("myProp", arg_array, reqArgs=True)
+print(mysym6.tojson())
+exe6 = mysym6.bind(ctx=mx.cpu(), args={'a':mx.nd.ones((3,2))})
+out6 = exe6.forward()
+print(out6)
+
+# without propagating shapes/types
+print('-------------------------------')
+print('Testing partitioning without shapes/types')
+mysym7 = sym2.optimize_for("myProp", reqArgs=True)
+exe7 = mysym7.bind(ctx=mx.cpu(), args={'a':mx.nd.ones((3,2))})
+out7 = exe7.forward()
+print(out7)
+
+# Gluon Hybridize partitioning with shapes/types
+print('-------------------------------')
+print('Testing Gluon Hybridize partitioning with shapes/types')
+inputs = [a]
+sym2_block = nn.SymbolBlock(sym2, inputs)
+sym2_block.initialize()
+sym2_block.hybridize(backend='myProp')
+out8 = sym2_block(mx.nd.ones((3,2)))
+print(out8)
diff --git a/include/mxnet/c_api.h b/include/mxnet/c_api.h
index efa0033..637b31d 100644
--- a/include/mxnet/c_api.h
+++ b/include/mxnet/c_api.h
@@ -2177,8 +2177,10 @@ MXNET_DLL int MXOptimizeForBackend(SymbolHandle
sym_handle,
const char* backend_name,
const int dev_type,
SymbolHandle* ret_sym_handle,
- const mx_uint len,
+ const mx_uint args_len,
NDArrayHandle* in_args_handle,
+ const mx_uint aux_len,
+ NDArrayHandle* in_aux_handle,
const mx_uint num_options,
const char** keys,
const char** vals);
diff --git a/include/mxnet/lib_api.h b/include/mxnet/lib_api.h
index d59f2b1..9b32122 100644
--- a/include/mxnet/lib_api.h
+++ b/include/mxnet/lib_api.h
@@ -39,7 +39,7 @@
#include <utility>
#include <stdexcept>
-#define MX_LIBRARY_VERSION 3
+#define MX_LIBRARY_VERSION 4
/*!
* \brief For loading multiple custom op libraries in Linux, exporting same
symbol multiple
@@ -234,10 +234,15 @@ enum MXReturnValue {
*/
struct MXTensor {
MXTensor() : data_ptr(nullptr), dtype(kUNSET), verID(0) {}
-
+ MXTensor(const MXTensor& oth) : data_ptr(oth.data_ptr), shape(oth.shape),
+ dtype(oth.dtype), verID(oth.verID), ctx(oth.ctx) {
+ setDLTensor();
+ }
MXTensor(void *data_ptr, const std::vector<int64_t> &shape, MXDType dtype,
size_t vID, MXContext mx_ctx)
- : data_ptr(data_ptr), shape(shape), dtype(dtype), verID(vID), ctx(mx_ctx) {}
+ : data_ptr(data_ptr), shape(shape), dtype(dtype), verID(vID), ctx(mx_ctx) {
+ setDLTensor();
+ }
/*! \brief populate internal tensor fields */
void setTensor(void *dptr, MXDType type, const int64_t* dims, int ndims,
@@ -406,7 +411,35 @@ class OpResource {
* \brief Json utility to parse serialized subgraph symbol
*/
/*! \brief Macro to help passing serialized subgraph through attribute dict */
-#define SUBGRAPH_SYM_JSON "subgraph_sym_json"
+#define MX_STR_SUBGRAPH_SYM_JSON "subgraph_sym_json"
+#define MX_STR_DTYPE "__dtype__"
+#define MX_STR_SHAPE "__shape__"
+
+/* \brief get shape value from list of shapes string
+ * format: [[1]] or [[1],[2]]
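+ * e.g. getShapeAt("[[3,2],[1]]", 0) returns "[3,2]", getShapeAt("[[3,2],[1]]", 1) returns "[1]"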
+ */
+std::string getShapeAt(const std::string& shape, unsigned index) {
+ int idx = 1; // start at 1 to skip the first square bracket [
+ // find the beginning of the output shape for the particular output index
+ for (unsigned x=0; x < index; x++)
+ idx = shape.find("[", idx+1);
+ int stop = shape.find("]", idx); // find stop index for this output shape
+ // add this shape to the list
+ return shape.substr(idx, stop-idx+1);
+}
+
+/* \brief get dtype value from list of dtypes string
+ * format: [1] or [1,2]
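+ * e.g. getDtypeAt("[0,1]", 0) returns "0", getDtypeAt("[0,1]", 1) returns "1"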
+ */
+std::string getDtypeAt(const std::string& dtype, unsigned index) {
+ // find the beginning of the output dtype for the particular output index
+ int idx = 0;
+ for (unsigned x=0; x < index; x++)
+ idx = dtype.find(",", idx+1);
+ int stop = dtype.find(",", idx+1); // find stop index for this output dtype
+ if (stop == -1) stop = dtype.find("]", idx+1);
+ return dtype.substr(idx+1, stop-idx-1);
+}
/*! \brief Types of JSON objects */
enum JsonType {ERR, STR, NUM, LIST, MAP};
@@ -713,11 +746,13 @@ class CustomOp {
};
/*! \brief Custom Subgraph Create function template */
-typedef MXReturnValue (*supportedOps_t)(std::string, std::vector<bool>,
+typedef MXReturnValue (*supportedOps_t)(std::string, std::vector<bool>&,
std::unordered_map<std::string,
std::string>&);
typedef MXReturnValue (*reviewSubgraph_t)(std::string, int, bool*,
std::unordered_map<std::string,
std::string>&,
- std::unordered_map<std::string,
std::string>&);
+ std::unordered_map<std::string,
std::string>&,
+ std::map<std::string, MXTensor>&,
+ std::map<std::string, MXTensor>&);
/*!
* \brief An abstract class for subgraph property
@@ -920,7 +955,17 @@ typedef int (*partCallSupportedOps_t)(supportedOps_t
supportedOps, const char *j
typedef int (*partCallReviewSubgraph_t)(reviewSubgraph_t reviewSubgraph, const
char *json,
int subgraph_id, int *accept, const
char* const* opt_keys,
const char* const* opt_vals, int
num_opts,
- char*** attr_keys, char*** attr_vals,
int *num_attrs);
+ char*** attr_keys, char*** attr_vals,
int *num_attrs,
+ const char* const* arg_names, int
num_args,
+ void* const* arg_data, const int64_t*
const* arg_shapes,
+ const int* arg_dims, const int*
arg_types,
+ const size_t* arg_IDs, const char*
const* arg_dev_type,
+ const int* arg_dev_id,
+ const char* const* aux_names, int
num_aux,
+ void* const* aux_data, const int64_t*
const* aux_shapes,
+ const int* aux_dims, const int*
aux_types,
+ const size_t* aux_IDs, const char*
const* aux_dev_type,
+ const int* aux_dev_id);
#define MXLIB_INITIALIZE_STR "initialize"
typedef int (*initialize_t)(int version);
@@ -1266,11 +1311,11 @@ extern "C" {
int num_ids, int *ids, const char* const* opt_keys,
const char* const* opt_vals, int num_opts) {
std::string subgraph_json(json);
- // create map of attributes from list
+ // create map of options from list
std::unordered_map<std::string, std::string> opts;
- for (int i = 0; i < num_opts; i++) {
+ for (int i = 0; i < num_opts; i++)
opts[std::string(opt_keys[i])] = std::string(opt_vals[i]);
- }
+
// create array of bools for operator support
std::vector<bool> _ids(num_ids, false);
// call user's supportedOps function
@@ -1293,19 +1338,55 @@ extern "C" {
_partCallReviewSubgraph(reviewSubgraph_t reviewSubgraph, const char *json,
int subgraph_id, int *accept, const char* const*
opt_keys,
const char* const* opt_vals, int num_opts,
- char*** attr_keys, char*** attr_vals, int
*num_attrs) {
+ char*** attr_keys, char*** attr_vals, int *num_attrs,
+ const char* const* arg_names, int num_args,
+ void* const* arg_data, const int64_t* const*
arg_shapes,
+ const int* arg_dims, const int* arg_types,
+ const size_t* arg_IDs, const char* const*
arg_dev_type,
+ const int* arg_dev_id,
+ const char* const* aux_names, int num_aux,
+ void* const* aux_data, const int64_t* const*
aux_shapes,
+ const int* aux_dims, const int* aux_types,
+ const size_t* aux_IDs, const char* const*
aux_dev_type,
+ const int* aux_dev_id) {
std::string subgraph_json(json);
bool accept_bool = false;
// create map of attributes from list
std::unordered_map<std::string, std::string> opts;
- for (int i = 0; i < num_opts; i++) {
+ for (int i = 0; i < num_opts; i++)
opts[std::string(opt_keys[i])] = std::string(opt_vals[i]);
+
+ // create a map of named tensors for args
+ std::map<std::string, MXTensor> args;
+ for (int i = 0; i < num_args; i++) {
+ std::vector<int64_t> shapes;
+ for (int j = 0; j < arg_dims[i]; j++)
+ shapes.push_back(arg_shapes[i][j]);
+
+ MXTensor tensor(arg_data[i], shapes, (MXDType)arg_types[i],
+ arg_IDs[i], {arg_dev_type[i], arg_dev_id[i]});
+ args[arg_names[i]] = tensor;
+ }
+ // create a map of named tensors for aux
+ std::map<std::string, MXTensor> aux;
+ for (int i = 0; i < num_aux; i++) {
+ std::vector<int64_t> shapes;
+ for (int j = 0; j < aux_dims[i]; j++)
+ shapes.push_back(aux_shapes[i][j]);
+
+ MXTensor tensor(aux_data[i], shapes, (MXDType)aux_types[i],
+ aux_IDs[i], {aux_dev_type[i], aux_dev_id[i]});
+ aux[aux_names[i]] = tensor;
}
+
// attributes to set on subgraph node
std::unordered_map<std::string, std::string> attrs;
- MXReturnValue retval = reviewSubgraph(subgraph_json, subgraph_id,
&accept_bool, opts, attrs);
+ MXReturnValue retval = reviewSubgraph(subgraph_json, subgraph_id,
&accept_bool,
+ opts, attrs, args, aux);
+ if (!retval) return retval;
+
*accept = accept_bool;
if (attrs.size() > 0) {
diff --git a/perl-package/AI-MXNetCAPI/mxnet.i
b/perl-package/AI-MXNetCAPI/mxnet.i
index 3bc53d6..846b28f 100644
--- a/perl-package/AI-MXNetCAPI/mxnet.i
+++ b/perl-package/AI-MXNetCAPI/mxnet.i
@@ -1633,6 +1633,8 @@ int MXOptimizeForBackend(SymbolHandle sym_handle,
const mx_uint in,
NDArrayHandle* in,
const mx_uint in,
+ NDArrayHandle* in,
+ const mx_uint in,
const char** keys,
const char** vals);
diff --git a/python/mxnet/gluon/block.py b/python/mxnet/gluon/block.py
index da76b3e..312358c 100644
--- a/python/mxnet/gluon/block.py
+++ b/python/mxnet/gluon/block.py
@@ -989,11 +989,15 @@ class HybridBlock(Block):
# get list of params in the order of out.list_arguments
arg_array = [args[data_names[name]] if name in data_names.keys()
else params[name].data()
for name in out.list_arguments()]
+ aux_array = [args[data_names[name]] if name in data_names.keys()
else params[name].data()
+ for name in out.list_auxiliary_states()]
# Partition the graph.
- out = out.optimize_for(self._backend, arg_array, ctx,
**self._backend_opts)
-
+ out = out.optimize_for(self._backend, arg_array, aux_array, ctx,
**self._backend_opts)
+ #update cached graph with partitioned graph
+ self._cached_graph = data, out
self._cached_op = ndarray.CachedOp(out, flags)
+
def _deferred_infer_shape(self, *args):
try:
self.infer_shape(*args)
@@ -1042,6 +1046,69 @@ class HybridBlock(Block):
out = [out]
return _regroup(out, self._out_format)
+ def optimize_for(self, x, *args, backend=None, backend_opts=None,
**kwargs):
+ """Partitions the current HybridBlock and optimizes it for a given
backend
+ without executing a forward pass. Modifies the HybridBlock in-place.
+
+ Immediately partitions a HybridBlock using the specified backend. Combines
+ the work done in the hybridize API with part of the work done in the forward
+ pass without calling the CachedOp. Can be used in place of hybridize;
+ afterwards `export` can be called or inference can be run. See the README in
+ example/extensions/lib_subgraph for more details.
+
+ Examples
+ --------
+ # partition and then export to file
+ block.optimize_for(x, backend='myPart')
+ block.export('partitioned')
+
+ # partition and then run inference
+ block.optimize_for(x, backend='myPart')
+ block(x)
+
+ Parameters
+ ----------
+ x : NDArray
+ first input to model
+ *args : NDArray
+ other inputs to model
+ backend : str
+ The name of backend, as registered in `SubgraphBackendRegistry`,
default None
+ backend_opts : dict of user-specified options to pass to the backend
for partitioning, optional
+ Passed on to `PrePartition` and `PostPartition` functions of
`SubgraphProperty`
+ static_alloc : bool, default False
+ Statically allocate memory to improve speed. Memory usage may
increase.
+ static_shape : bool, default False
+ Optimize for invariant input shapes between iterations. Must also
+ set static_alloc to True. Change of input shapes is still allowed
+ but slower.
+ """
+
+ # do hybridize API call
+ self.hybridize(True, backend, backend_opts, **kwargs)
+
+ # do part of forward API call
+ has_symbol, has_ndarray, ctx_set, _ = _gather_type_ctx_info([x] +
list(args))
+ if has_symbol:
+ raise ValueError('Inputs must be NDArrays for the optimize_for API'
+ ' Please check the type of the args.\n')
+ if not has_symbol and not has_ndarray:
+ raise ValueError('In HybridBlock, there must be one NDArray as
input.'
+ ' Please check the type of the args.\n')
+ if len(ctx_set) > 1:
+ raise ValueError('Find multiple contexts in the input, '
+ 'After hybridized, the HybridBlock only supports
one input '
+ 'context. You can print the ele.ctx in the '
+ 'input arguments to inspect their contexts. '
+ 'Find all contexts = {}'.format(ctx_set))
+
+ self._build_cache(x, *args)
+ assert self._cached_op, "Gluon failed to build the cache. " \
+ "This should never happen. " \
+ "Please submit an issue on Github" \
+ " https://github.com/apache/incubator-mxnet."
+ # do not actually call the cached_op
+
def _clear_cached_op(self):
self._cached_graph = ()
self._cached_op = None
diff --git a/python/mxnet/symbol/symbol.py b/python/mxnet/symbol/symbol.py
index 706152f..8e53a1a 100644
--- a/python/mxnet/symbol/symbol.py
+++ b/python/mxnet/symbol/symbol.py
@@ -1446,7 +1446,7 @@ class Symbol(SymbolBase):
return Symbol(handle)
- def optimize_for(self, backend, args=None, ctx=None, **kwargs):
+ def optimize_for(self, backend, args=None, aux=None, ctx=None, **kwargs):
"""Partitions current symbol and optimizes it for a given backend,
returns new partitioned symbol.
@@ -1462,6 +1462,13 @@ class Symbol(SymbolBase):
- If type is a dict of str to `NDArray`, then it maps the name of
arguments
to the corresponding `NDArray`.
+ aux : list of NDArray or dict of str to NDArray, optional
+ Input auxiliary arguments to the symbol
+
+ - If type is a list of `NDArray`, the order is the same as that of
+ `list_auxiliary_states()`.
+ - If type is a dict of str to `NDArray`, then it maps the name of
+ auxiliary states to the corresponding `NDArray`.
+
ctx : Context, optional
Device context, used to infer stypes
@@ -1476,13 +1483,19 @@ class Symbol(SymbolBase):
out = SymbolHandle()
assert isinstance(backend, str)
- if args is None:
+ if args is None or len(args) == 0:
args = []
args_handle = c_array(NDArrayHandle, [])
else:
- listed_arguments = self.list_arguments()
- args_handle, args = self._get_ndarray_inputs('args', args,
listed_arguments, False)
+ args_handle, args = self._get_ndarray_inputs('args', args,
+
self.list_arguments(), False)
+ if aux is None or len(aux) == 0:
+ aux = []
+ aux_handle = c_array(NDArrayHandle, [])
+ else:
+ aux_handle, aux = self._get_ndarray_inputs('aux_states', aux,
+
self.list_auxiliary_states(), False)
if ctx is None:
ctx = current_context()
assert isinstance(ctx, Context)
@@ -1498,6 +1511,8 @@ class Symbol(SymbolBase):
ctypes.byref(out),
mx_uint(len(args)),
args_handle,
+ mx_uint(len(aux)),
+ aux_handle,
mx_uint(len(key_list)),
c_str_array(key_list),
c_str_array(val_list)))
diff --git a/src/c_api/c_api.cc b/src/c_api/c_api.cc
index ad30350..db0e262 100644
--- a/src/c_api/c_api.cc
+++ b/src/c_api/c_api.cc
@@ -120,17 +120,28 @@ void CustomFComputeDispatcher(const std::string op_name,
std::vector<size_t> in_verIDs, out_verIDs;
std::vector<const char*> in_dev_type, out_dev_type;
std::vector<int> in_dev_id, out_dev_id;
+ std::vector<NDArray> conv_mkl; // converted NDArrays from MKLDNN format
// convert inputs/outpus NDArray to C types to be passed to lib_api.h
for (size_t i = 0; i < inputs.size(); i++) {
- in_data.push_back(inputs[i].data().dptr_);
- in_shapes.push_back(inputs[i].shape().data());
- in_dims.push_back(inputs[i].shape().ndim());
- in_types.push_back(inputs[i].dtype());
- in_verIDs.push_back(inputs[i].version());
- const char* ctx_str = inputs[i].ctx().dev_mask() == Context::kCPU ? "cpu"
: "gpu";
+ NDArray const* in_nd = &(inputs[i]);
+#if MXNET_USE_MKLDNN == 1
+ // reorder data if in MKLDNN format
+ if (in_nd->IsMKLDNNData()) {
+ // convert from MKLDNN
+ conv_mkl.push_back(in_nd->Reorder2Default());
+ in_nd = &(conv_mkl.back());
+ }
+#endif
+ // pull out parts to pass over to library
+ in_data.push_back(in_nd->data().dptr_);
+ in_shapes.push_back(in_nd->shape().data());
+ in_dims.push_back(in_nd->shape().ndim());
+ in_types.push_back(in_nd->dtype());
+ in_verIDs.push_back(in_nd->version());
+ const char* ctx_str = in_nd->ctx().dev_mask() == Context::kCPU ? "cpu" :
"gpu";
in_dev_type.push_back(ctx_str);
- in_dev_id.push_back(inputs[i].ctx().real_dev_id());
+ in_dev_id.push_back(in_nd->ctx().real_dev_id());
}
for (size_t i = 0; i < outputs.size(); i++) {
@@ -193,7 +204,7 @@ void CustomFComputeDispatcher(const std::string op_name,
if (fcomp_fp != nullptr) {
// convert attributes to vector of char*
std::vector<const char*> attr_keys, attr_vals;
- for (auto kv : attrs->dict) {
+ for (auto &kv : attrs->dict) {
attr_keys.push_back(kv.first.c_str());
attr_vals.push_back(kv.second.c_str());
}
@@ -361,7 +372,7 @@ int MXLoadLib(const char *path) {
auto attr_parser = [=](const NodeAttrs* attrs) {
// convert attributes to vector of char
std::vector<const char*> attr_keys, attr_vals;
- for (auto kv : attrs->dict) {
+ for (auto &kv : attrs->dict) {
attr_keys.push_back(kv.first.c_str());
attr_vals.push_back(kv.second.c_str());
}
@@ -371,7 +382,7 @@ int MXLoadLib(const char *path) {
nnvm::Graph g;
g.outputs = attrs->subgraphs[0].get()->outputs;
subgraph_json = nnvm::pass::SaveJSON(g);
- attr_keys.push_back(SUBGRAPH_SYM_JSON);
+ attr_keys.push_back(MX_STR_SUBGRAPH_SYM_JSON);
attr_vals.push_back(subgraph_json.c_str());
}
@@ -388,7 +399,7 @@ int MXLoadLib(const char *path) {
auto num_inputs = [=](const NodeAttrs& attrs) {
// convert attributes to vector of char
std::vector<const char*> attr_keys, attr_vals;
- for (auto kv : attrs.dict) {
+ for (auto &kv : attrs.dict) {
attr_keys.push_back(kv.first.c_str());
attr_vals.push_back(kv.second.c_str());
}
@@ -406,7 +417,7 @@ int MXLoadLib(const char *path) {
auto num_outputs = [=](const NodeAttrs& attrs) {
// convert attributes to vector of char*
std::vector<const char*> attr_keys, attr_vals;
- for (auto kv : attrs.dict) {
+ for (auto &kv : attrs.dict) {
attr_keys.push_back(kv.first.c_str());
attr_vals.push_back(kv.second.c_str());
}
@@ -425,7 +436,7 @@ int MXLoadLib(const char *path) {
auto num_inouts = [=](const NodeAttrs& attrs) {
// convert attributes to vector of char*
std::vector<const char*> attr_keys, attr_vals;
- for (auto kv : attrs.dict) {
+ for (auto &kv : attrs.dict) {
attr_keys.push_back(kv.first.c_str());
attr_vals.push_back(kv.second.c_str());
}
@@ -445,7 +456,7 @@ int MXLoadLib(const char *path) {
mxnet::ShapeVector *out_shape) {
// convert attributes to vector of char*
std::vector<const char*> attr_keys, attr_vals;
- for (auto kv : attrs.dict) {
+ for (auto &kv : attrs.dict) {
attr_keys.push_back(kv.first.c_str());
attr_vals.push_back(kv.second.c_str());
}
@@ -516,7 +527,7 @@ int MXLoadLib(const char *path) {
std::vector<int> *out_type) {
// convert attributes to vector of char*
std::vector<const char*> attr_keys, attr_vals;
- for (auto kv : attrs.dict) {
+ for (auto &kv : attrs.dict) {
attr_keys.push_back(kv.first.c_str());
attr_vals.push_back(kv.second.c_str());
}
@@ -544,7 +555,7 @@ int MXLoadLib(const char *path) {
auto mutate_inputs = [=](const nnvm::NodeAttrs& attrs) {
// convert attributes to vector of char*
std::vector<const char*> attr_keys, attr_vals;
- for (auto kv : attrs.dict) {
+ for (auto &kv : attrs.dict) {
attr_keys.push_back(kv.first.c_str());
attr_vals.push_back(kv.second.c_str());
}
@@ -629,7 +640,7 @@ int MXLoadLib(const char *path) {
const std::vector<int>& in_types) {
// convert attributes to vector of char*
std::vector<const char*> attr_keys, attr_vals;
- for (auto kv : attrs.dict) {
+ for (auto &kv : attrs.dict) {
attr_keys.push_back(kv.first.c_str());
attr_vals.push_back(kv.second.c_str());
}
@@ -640,7 +651,7 @@ int MXLoadLib(const char *path) {
nnvm::Graph g;
g.outputs = attrs.subgraphs[0].get()->outputs;
subgraph_json = nnvm::pass::SaveJSON(g);
- attr_keys.push_back(SUBGRAPH_SYM_JSON);
+ attr_keys.push_back(MX_STR_SUBGRAPH_SYM_JSON);
attr_vals.push_back(subgraph_json.c_str());
}
@@ -858,12 +869,10 @@ int MXLoadLib(const char *path) {
std::string op_name_str(op_name);
LOG(INFO) << "\t\tStrategy[" << j << "] " << strategy_str
<< " subgraphOp: '" << op_name_str << "'";
-
- // MXNET_REGISTER_SUBGRAPH_PROPERTY(customBackend,
CustomSubgraphProperty);
-
mxnet::op::SubgraphBackendRegistry::Get()->__REGISTER_CUSTOM_PROPERTY__(name_str,
-
std::make_shared<mxnet::op::CustomSubgraphProperty>(
- strategy_str, callSupportedOps, supportedOps_fp,
- callReviewSubgraph, reviewSubgraph_fp, callFree,
op_name_str));
+ mxnet::op::SubgraphBackendRegistry::Get()->__REGISTER_CUSTOM_PROPERTY__
+ (name_str, std::make_shared<mxnet::op::CustomSubgraphProperty>
+ (strategy_str, callSupportedOps, supportedOps_fp,
+ callReviewSubgraph, reviewSubgraph_fp, callFree, op_name_str));
}
}
API_END();
diff --git a/src/c_api/c_api_symbolic.cc b/src/c_api/c_api_symbolic.cc
index 9042dfa..4ec916d 100644
--- a/src/c_api/c_api_symbolic.cc
+++ b/src/c_api/c_api_symbolic.cc
@@ -1353,32 +1353,54 @@ int MXOptimizeForBackend(SymbolHandle sym_handle,
const char* backend_name,
const int dev_type,
SymbolHandle* ret_sym_handle,
- const mx_uint len,
+ const mx_uint args_len,
NDArrayHandle* in_args_handle,
+ const mx_uint aux_len,
+ NDArrayHandle* in_aux_handle,
const mx_uint num_options,
const char** keys,
const char** vals) {
+ // create copy of input symbol
nnvm::Symbol *s = new nnvm::Symbol();
API_BEGIN();
nnvm::Symbol *sym = static_cast<nnvm::Symbol *>(sym_handle);
*s = sym->Copy();
nnvm::Graph g = Symbol2Graph(*s);
- if (len) {
+ const auto& indexed_graph = g.indexed_graph();
+ const auto& mutable_nodes = indexed_graph.mutable_input_nodes();
+ std::vector<std::string> input_names =
sym->ListInputNames(nnvm::Symbol::kAll);
+ size_t num_forward_inputs = input_names.size();
+ if (args_len || aux_len) {
NDArray **in_args_ptr = reinterpret_cast<NDArray**>(in_args_handle);
+ NDArray **in_aux_ptr = reinterpret_cast<NDArray**>(in_aux_handle);
Context default_ctx =
Context::Create(static_cast<Context::DeviceType>(dev_type), 0);
- mxnet::ShapeVector arg_shapes(len);
- nnvm::DTypeVector arg_dtypes(len);
- StorageTypeVector arg_stypes(len);
- for (mx_uint i = 0; i < len; i++) {
- const auto &in_arg = *(in_args_ptr[i]);
- arg_shapes[i] = in_arg.shape();
- arg_dtypes[i] = in_arg.dtype();
- arg_stypes[i] = in_arg.storage_type();
+ mxnet::ShapeVector arg_shapes(args_len + aux_len);
+ nnvm::DTypeVector arg_dtypes(args_len + aux_len);
+ StorageTypeVector arg_stypes(args_len + aux_len);
+ size_t args_top = 0, aux_top = 0;
+ // loop over inputs to symbol in order and add to args/aux if mutable
+ for (size_t i = 0; i < num_forward_inputs; ++i) {
+ const uint32_t nid = indexed_graph.input_nodes().at(i);
+ if (mutable_nodes.count(nid)) {
+ CHECK_LT(aux_top, aux_len)
+ << "Cannot find aux '" << input_names[i] << "' in provided aux to
optimize_for";
+ const auto &in_arg = *(in_aux_ptr[aux_top++]);
+ arg_shapes[i] = in_arg.shape();
+ arg_dtypes[i] = in_arg.dtype();
+ arg_stypes[i] = in_arg.storage_type();
+ } else {
+ CHECK_LT(args_top, args_len)
+ << "Cannot find arg '" << input_names[i] << "' in provided args to
optimize_for";
+ const auto &in_arg = *(in_args_ptr[args_top++]);
+ arg_shapes[i] = in_arg.shape();
+ arg_dtypes[i] = in_arg.dtype();
+ arg_stypes[i] = in_arg.storage_type();
+ }
}
- const auto& indexed_graph = g.indexed_graph();
- const auto num_forward_inputs = indexed_graph.input_nodes().size();
+
g.attrs["context"] = std::make_shared<nnvm::any>(
exec::ContextVector(indexed_graph.num_nodes(), default_ctx));
+
// infer shapes
g = exec::InferShape(std::move(g), std::move(arg_shapes), "__shape__");
// infer dtypes
@@ -1393,11 +1415,31 @@ int MXOptimizeForBackend(SymbolHandle sym_handle,
common::HandleInferStorageTypeError(num_forward_inputs, indexed_graph,
g.GetAttr<StorageTypeVector>("storage_type"));
}
+ // set args/aux as attributes on graph so that subgraph property can use
them
+ std::vector<std::string> arg_names =
sym->ListInputNames(nnvm::Symbol::kReadOnlyArgs);
+ g.attrs["in_args"] = std::make_shared<nnvm::any>(in_args_ptr);
+ g.attrs["in_arg_names"] = std::make_shared<nnvm::any>(arg_names);
+
+ std::vector<std::string> aux_names =
sym->ListInputNames(nnvm::Symbol::kAuxiliaryStates);
+ g.attrs["in_aux"] = std::make_shared<nnvm::any>(in_aux_ptr);
+ g.attrs["in_aux_names"] = std::make_shared<nnvm::any>(aux_names);
+ } else {
+ // args/aux were not specified, so set nullptr/empty-lists
+ NDArray **in_args_ptr = static_cast<NDArray**>(nullptr);
+ std::vector<std::string> arg_names;
+ g.attrs["in_args"] = std::make_shared<nnvm::any>(in_args_ptr);
+ g.attrs["in_arg_names"] = std::make_shared<nnvm::any>(arg_names);
+
+ NDArray **in_aux_ptr = static_cast<NDArray**>(nullptr);
+ std::vector<std::string> aux_names;
+ g.attrs["in_aux"] = std::make_shared<nnvm::any>(in_aux_ptr);
+ g.attrs["in_aux_names"] = std::make_shared<nnvm::any>(aux_names);
}
+ // create a data structure from pointer array
std::vector<std::pair<std::string, std::string>> options_map;
- for (mx_uint i = 0; i < num_options; ++i) {
+ for (mx_uint i = 0; i < num_options; ++i)
options_map.emplace_back(keys[i], vals[i]);
- }
+
const auto backend =
mxnet::op::SubgraphBackendRegistry::Get()->GetSubgraphBackend(backend_name);
const auto& subgraph_prop_list = backend->GetSubgraphProperties();
for (auto property : subgraph_prop_list) {
diff --git a/src/operator/subgraph/build_subgraph.cc
b/src/operator/subgraph/build_subgraph.cc
index dc0c142..413395c 100644
--- a/src/operator/subgraph/build_subgraph.cc
+++ b/src/operator/subgraph/build_subgraph.cc
@@ -560,11 +560,7 @@ void CutGraphInputs(const std::vector<nnvm::NodeEntry*>
&input_entries,
}
nnvm::ObjectPtr n = nnvm::CreateVariableNode(
var_name + std::to_string(name_count_map[var_name]));
- // set attribute for subgraph input to indicate if it is from an arg/param
to model
- if (e->node->is_variable())
- n->attrs.dict["isArg"] = "True";
- else
- n->attrs.dict["isArg"] = "False";
+
*e = nnvm::NodeEntry{n, 0, 0};
}
}
@@ -583,7 +579,7 @@ void ReattachGraphInputs(const
std::vector<nnvm::NodeEntry*> &input_entries,
}
/*!
- * \brief Replace a set of nodes belonging to the same subgraph with a
subgrpah node
+ * \brief Replace a set of nodes belonging to the same subgraph with a
subgraph node
* and keep the subgraph in the subgraph node.
*/
void CreateSubgraphNode(nnvm::Graph* g,
@@ -613,6 +609,7 @@ void CreateSubgraphNode(nnvm::Graph* g,
sym.outputs[i] = *output_entries[i];
}
const SubgraphPropertyPtr& subg_prop =
g->GetAttr<SubgraphPropertyPtr>("subgraph_property");
+ subg_prop->InitSubgraphInputs(&input_entries, &orig_input_entries);
nnvm::ObjectPtr n = subg_prop->CreateSubgraphNode(sym, subgraph_selector,
subgraph_id);
// CreateSubgraphNode returns NULL if subgraph property determines that
subgraph is sub-optimal
// In that case, subgraph node is not created and graph is not modified
diff --git a/src/operator/subgraph/partitioner/custom_subgraph_property.h
b/src/operator/subgraph/partitioner/custom_subgraph_property.h
index 410d983..b7f2cc2 100644
--- a/src/operator/subgraph/partitioner/custom_subgraph_property.h
+++ b/src/operator/subgraph/partitioner/custom_subgraph_property.h
@@ -33,6 +33,7 @@
#include <string>
#include <utility>
#include <vector>
+#include <map>
#include "../common.h"
#include "../subgraph_property.h"
#include "../../include/mxnet/lib_api.h"
@@ -99,6 +100,75 @@ class CustomSubgraphProperty: public SubgraphProperty {
const std::vector<std::pair<std::string, std::string>>& options_map) {
// clear supported_nodes to remove state from previous calls
supported_nodes.clear();
+ // get input args and arg names
+ in_arg_names = g.GetAttr<std::vector<std::string>>("in_arg_names");
+ in_args_ptr = g.GetAttr<NDArray**>("in_args");
+ in_aux_names = g.GetAttr<std::vector<std::string>>("in_aux_names");
+ in_aux_ptr = g.GetAttr<NDArray**>("in_aux");
+
+ // convert input args
+ arg_names.clear();
+ arg_data.clear();
+ arg_shapes.clear();
+ arg_dims.clear();
+ arg_types.clear();
+ arg_verIDs.clear();
+ arg_dev_type.clear();
+ arg_dev_id.clear();
+ for (size_t i=0; i < in_arg_names.size(); i++) {
+ arg_names.push_back(in_arg_names[i].c_str());
+ const NDArray &in_arg = *(in_args_ptr[i]);
+
+#if MXNET_USE_MKLDNN == 1
+ // reorder data if in MKLDNN format
+ if (in_arg.IsMKLDNNData()) {
+ in_arg.Reorder2DefaultAsync();
+ in_arg.WaitToRead();
+ }
+#endif
+
+ // pull out parts of NDArray to send to backend
+ arg_data.push_back(in_arg.data().dptr_);
+ arg_shapes.push_back(in_arg.shape().data());
+ arg_dims.push_back(in_arg.shape().ndim());
+ arg_types.push_back(in_arg.dtype());
+ arg_verIDs.push_back(in_arg.version());
+ const char* arg_ctx_str = in_arg.ctx().dev_mask() == Context::kCPU ?
"cpu" : "gpu";
+ arg_dev_type.push_back(arg_ctx_str);
+ arg_dev_id.push_back(in_arg.ctx().real_dev_id());
+ }
+
+ // convert input aux
+ aux_names.clear();
+ aux_data.clear();
+ aux_shapes.clear();
+ aux_dims.clear();
+ aux_types.clear();
+ aux_verIDs.clear();
+ aux_dev_type.clear();
+ aux_dev_id.clear();
+ for (size_t i=0; i < in_aux_names.size(); i++) {
+ aux_names.push_back(in_aux_names[i].c_str());
+ const auto &in_aux = *(in_aux_ptr[i]);
+
+#if MXNET_USE_MKLDNN == 1
+ // reorder data if in MKLDNN format
+ if (in_aux.IsMKLDNNData()) {
+ in_aux.Reorder2DefaultAsync();
+ in_aux.WaitToRead();
+ }
+#endif
+
+ // pull out parts of NDArray to send to backend
+ aux_data.push_back(in_aux.data().dptr_);
+ aux_shapes.push_back(in_aux.shape().data());
+ aux_dims.push_back(in_aux.shape().ndim());
+ aux_types.push_back(in_aux.dtype());
+ aux_verIDs.push_back(in_aux.version());
+ const char* aux_ctx_str = in_aux.ctx().dev_mask() == Context::kCPU ?
"cpu" : "gpu";
+ aux_dev_type.push_back(aux_ctx_str);
+ aux_dev_id.push_back(in_aux.ctx().real_dev_id());
+ }
// remove all graph attrs, some cannot be saved to json
nnvm::Graph graph = std::move(g);
@@ -108,23 +178,37 @@ class CustomSubgraphProperty: public SubgraphProperty {
// set shape attrs for each node in the graph
if (g.HasAttr("shape")) {
mxnet::ShapeVector shapes = g.GetAttr<mxnet::ShapeVector>("shape");
- for (unsigned i = 0; i < indexed_graph.num_nodes(); i++) {
- nnvm::Node* node = const_cast<nnvm::Node*>(indexed_graph[i].source);
- mxnet::TShape shape = shapes[i];
+ for (unsigned nid = 0; nid < indexed_graph.num_nodes(); nid++) {
+ nnvm::Node* node = const_cast<nnvm::Node*>(indexed_graph[nid].source);
std::stringstream ss;
- ss << shape;
- node->attrs.dict["shape"] = ss.str();
+ ss << "[";
+ // set the output shapes for this node
+ for (unsigned oid = 0; oid < node->num_outputs(); oid++) {
+ const uint32_t out_entry_id = indexed_graph.entry_id(nid, oid);
+ mxnet::TShape& shape = shapes[out_entry_id];
+ ss << shape;
+ if (oid < node->num_outputs()-1) ss << ",";
+ }
+ ss << "]";
+ node->attrs.dict[MX_STR_SHAPE] = ss.str();
}
}
// set dtype attrs for each node in the graph
if (g.HasAttr("dtype")) {
std::vector<int> dtypes = g.GetAttr<std::vector<int> >("dtype");
- for (unsigned i = 0; i < indexed_graph.num_nodes(); i++) {
- nnvm::Node* node = const_cast<nnvm::Node*>(indexed_graph[i].source);
- int dtype = dtypes[i];
+ for (unsigned nid = 0; nid < indexed_graph.num_nodes(); nid++) {
+ nnvm::Node* node = const_cast<nnvm::Node*>(indexed_graph[nid].source);
std::stringstream ss;
- ss << dtype;
- node->attrs.dict["dtype"] = ss.str();
+ ss << "[";
+ // set the output dtypes for this node
+ for (unsigned oid = 0; oid < node->num_outputs(); oid++) {
+ const uint32_t out_entry_id = indexed_graph.entry_id(nid, oid);
+ int dtype = dtypes[out_entry_id];
+ ss << dtype;
+ if (oid < node->num_outputs()-1) ss << ",";
+ }
+ ss << "]";
+ node->attrs.dict[MX_STR_DTYPE] = ss.str();
}
}
@@ -142,10 +226,14 @@ class CustomSubgraphProperty: public SubgraphProperty {
opt_keys_.clear();
opt_vals_.clear();
options_map_.clear();
- for (auto kv : options_map) {
+ // store options in map in subgraph property to re-use later for
reviewSubgraph
+ for (auto& kv : options_map) {
options_map_.push_back(kv);
- opt_keys_.push_back(options_map_.back().first.c_str());
- opt_vals_.push_back(options_map_.back().second.c_str());
+ }
+ // convert options_map_ to char* to pass to backend library
+ for (auto& kv : options_map_) {
+ opt_keys_.push_back(kv.first.c_str());
+ opt_vals_.push_back(kv.second.c_str());
}
CHECK(call_supported_ops_(supported_ops_, json, supported_node_IDs.size(),
ids,
@@ -162,9 +250,10 @@ class CustomSubgraphProperty: public SubgraphProperty {
}
// override CreateSubgraphNode
virtual nnvm::ObjectPtr CreateSubgraphNode(const nnvm::Symbol &sym,
- const int subgraph_id = 0) const {
+ const int subgraph_id = 0) const {
int accept = 1;
int num_attr = 0;
+ std::map<std::string, std::string> user_attrs;
char** attr_keys = nullptr;
char** attr_vals = nullptr;
if (review_subgraph_) {
@@ -173,8 +262,9 @@ class CustomSubgraphProperty: public SubgraphProperty {
const auto& idx = g.indexed_graph();
// set isArg/isAux for each null op/param in the graph
- const std::vector<std::string> aux_names =
sym.ListInputNames(nnvm::Symbol::kAuxiliaryStates);
- std::unordered_set<std::string> aux_set(aux_names.begin(),
aux_names.end());
+ const std::vector<std::string> aux_state_names =
+ sym.ListInputNames(nnvm::Symbol::kAuxiliaryStates);
+ std::unordered_set<std::string> aux_set(aux_state_names.begin(),
aux_state_names.end());
for (unsigned i = 0; i < idx.num_nodes(); i++) {
nnvm::Node* node = const_cast<nnvm::Node*>(idx[i].source);
// check if this node is input to subgraph
@@ -188,31 +278,121 @@ class CustomSubgraphProperty: public SubgraphProperty {
}
std::string subgraph_json = nnvm::pass::SaveJSON(g);
- CHECK(call_review_subgraph_(review_subgraph_, subgraph_json.c_str(),
- subgraph_id, &accept, opt_keys_.data(),
- opt_vals_.data(), opt_keys_.size(),
- &attr_keys, &attr_vals, &num_attr))
+ CHECK(call_review_subgraph_(review_subgraph_, subgraph_json.c_str(),
subgraph_id,
+ &accept, opt_keys_.data(), opt_vals_.data(),
+ opt_keys_.size(), &attr_keys, &attr_vals,
&num_attr,
+ arg_names.data(), arg_names.size(),
arg_data.data(),
+ arg_shapes.data(), arg_dims.data(),
arg_types.data(),
+ arg_verIDs.data(), arg_dev_type.data(),
+ arg_dev_id.data(), aux_names.data(),
aux_names.size(),
+ aux_data.data(), aux_shapes.data(),
aux_dims.data(),
+ aux_types.data(), aux_verIDs.data(),
+ aux_dev_type.data(), aux_dev_id.data()))
<< "Error calling review_subgraph for '" << subgraph_prop << "'";
+
+ if (num_attr > 0) {
+ // set user specified attributes
+ for (int i=0; i < num_attr; i++) {
+ user_attrs[attr_keys[i]] = attr_vals[i];
+ call_free_(attr_vals[i]);
+ call_free_(attr_keys[i]);
+ }
+ // free memory used by custom op to allocate attributes
+ call_free_(attr_vals);
+ call_free_(attr_keys);
+ }
}
+
if (accept) {
nnvm::ObjectPtr n = nnvm::Node::Create();
n->attrs.op = Op::Get(subgraph_op_name);
n->attrs.name = "_op" + std::to_string(subgraph_id);
n->attrs.subgraphs.push_back(std::make_shared<nnvm::Symbol>(sym));
- // set user specified attributes
- for (int i=0; i < num_attr; i++) {
- n->attrs.dict[attr_keys[i]] = attr_vals[i];
- call_free_(attr_vals[i]);
- call_free_(attr_keys[i]);
+
+ // set shapes
+ {
+ std::stringstream ss;
+ ss << "[";
+ for (unsigned i=0; i < sym.outputs.size(); i++) {
+ const nnvm::NodeEntry& e = sym.outputs[i];
+ if (e.node->attrs.dict.count("__shape__") > 0) {
+ std::string& shape = e.node->attrs.dict["__shape__"];
+ // add this shape to the list
+ ss << getShapeAt(shape, e.index);
+ }
+ if (i < sym.outputs.size()-1)
+ ss << ",";
+ }
+ ss << "]";
+ n->attrs.dict["__shape__"] = ss.str();
+ }
+ // set dtypes
+ {
+ std::stringstream ss;
+ ss << "[";
+ for (unsigned i=0; i < sym.outputs.size(); i++) {
+ const nnvm::NodeEntry& e = sym.outputs[i];
+ if (e.node->attrs.dict.count("__dtype__") > 0) {
+ std::string& dtype = e.node->attrs.dict["__dtype__"];
+ // add this dtype to the list
+ ss << getDtypeAt(dtype, e.index);
+ }
+ if (i < sym.outputs.size()-1)
+ ss << ",";
+ }
+ ss << "]";
+ n->attrs.dict["__dtype__"] = ss.str();
}
- // free memory used by custom op to allocate attributes
- call_free_(attr_vals);
- call_free_(attr_keys);
+ // set user specified attributes
+ for (auto attr : user_attrs)
+ n->attrs.dict[attr.first] = attr.second;
return n;
} else {
return nullptr;
}
}
+
+ virtual void InitSubgraphInputs(std::vector<nnvm::NodeEntry*>* input_entries,
+ std::vector<nnvm::NodeEntry>*
orig_input_entries) const {
+ for (size_t i = 0; i < input_entries->size(); ++i) {
+ nnvm::NodeEntry *e = input_entries->at(i);
+ nnvm::NodeEntry& orig = orig_input_entries->at(i);
+
+ // set attribute for subgraph input to indicate if it is from an
arg/param to model
+ if (orig.node->is_variable()) {
+ // get name of original output entry
+ nnvm::Symbol sym;
+ sym.outputs.push_back(orig);
+ const auto output_names = sym.ListOutputNames();
+ CHECK_EQ(output_names.size(), 1U);
+ const std::string& var_name = output_names[0];
+
+ e->node->attrs.dict["isArg"] = "True";
+ e->node->attrs.dict["argName"] = var_name;
+ } else {
+ e->node->attrs.dict["isArg"] = "False";
+ }
+
+ // pass down other attributes if available
+ if (orig.node->attrs.dict.count("__dtype__") > 0) {
+ // get dtype string from other node
+ std::string& dtype = orig.node->attrs.dict["__dtype__"];
+ std::stringstream ss;
+ ss << "[" << getDtypeAt(dtype, orig.index) << "]";
+ e->node->attrs.dict["__dtype__"] = ss.str();
+ }
+
+ if (orig.node->attrs.dict.count("__shape__") > 0) {
+ // get shape string from other node
+ std::string& shape = orig.node->attrs.dict["__shape__"];
+ // create new shape string for this node
+ std::stringstream ss;
+ ss << "[" << getShapeAt(shape, orig.index) << "]";
+ e->node->attrs.dict["__shape__"] = ss.str();
+ }
+ }
+ }
+
// override CreateSubgraphSelector
virtual SubgraphSelectorPtr CreateSubgraphSelector() const {
return std::make_shared<CustomContainOpSelector>(supported_nodes);
@@ -228,6 +408,17 @@ class CustomSubgraphProperty: public SubgraphProperty {
std::string subgraph_op_name;
std::vector<std::pair<std::string, std::string>> options_map_;
std::vector<const char*> opt_keys_, opt_vals_;
+ std::vector<std::string> in_arg_names, in_aux_names;
+ NDArray **in_args_ptr;
+ NDArray **in_aux_ptr;
+ std::vector<const char*> arg_names, aux_names;
+ std::vector<void*> arg_data, aux_data;
+ std::vector<const int64_t*> arg_shapes, aux_shapes;
+ std::vector<int> arg_dims, aux_dims;
+ std::vector<int> arg_types, aux_types;
+ std::vector<size_t> arg_verIDs, aux_verIDs;
+ std::vector<const char*> arg_dev_type, aux_dev_type;
+ std::vector<int> arg_dev_id, aux_dev_id;
};
} // namespace op
} // namespace mxnet
diff --git a/src/operator/subgraph/subgraph_property.h
b/src/operator/subgraph/subgraph_property.h
index e68fc68..5e87626 100644
--- a/src/operator/subgraph/subgraph_property.h
+++ b/src/operator/subgraph/subgraph_property.h
@@ -359,6 +359,14 @@ class SubgraphProperty {
subgraph_node->inputs = *orig_input_entries;
}
/*!
+ * \brief Initialize subgraph internal inputs with external input entries.
+ * Called before CreateSubgraphNode, optional
+ * \param input_entries input entries inside subgraph
+ * \param orig_input_entries input entries outside subgraph
+ */
+ virtual void InitSubgraphInputs(std::vector<nnvm::NodeEntry*>* input_entries,
+ std::vector<nnvm::NodeEntry>*
orig_input_entries) const {}
+ /*!
* \brief Set an attr with name in the attr map.
*/
template <typename T>
diff --git a/tests/python/unittest/test_extensions.py
b/tests/python/unittest/test_extensions.py
index 799615b..d00f149 100644
--- a/tests/python/unittest/test_extensions.py
+++ b/tests/python/unittest/test_extensions.py
@@ -167,3 +167,17 @@ def test_subgraph():
out4 = sym_block(mx.nd.ones((3,2)),mx.nd.ones((3,2)))
# check that result matches one executed by MXNet
assert_almost_equal(out[0].asnumpy(), out4[0].asnumpy(), rtol=1e-3,
atol=1e-3)
+
+ # Gluon Hybridize partitioning with shapes/types
+ sym_block2 = nn.SymbolBlock(sym, [a,b])
+ sym_block2.initialize()
+ a_data = mx.nd.ones((3,2))
+ b_data = mx.nd.ones((3,2))
+ sym_block2.optimize_for(a_data, b_data, backend='myProp')
+ sym_block2.export('optimized')
+ sym_block3 = nn.SymbolBlock.imports('optimized-symbol.json',['a','b'],
+ 'optimized-0000.params')
+
+ out5 = sym_block3(a_data, b_data)
+ # check that result matches one executed by MXNet
+ assert_almost_equal(out[0].asnumpy(), out5[0].asnumpy(), rtol=1e-3,
atol=1e-3)
diff --git a/tests/python/unittest/test_subgraph_op.py
b/tests/python/unittest/test_subgraph_op.py
index f1572e7..e414a98 100644
--- a/tests/python/unittest/test_subgraph_op.py
+++ b/tests/python/unittest/test_subgraph_op.py
@@ -282,7 +282,7 @@ def check_subgraph_exe6(sym, subgraph_backend, op_names):
# infer shape/type before partition before simple_bind
check_call(_LIB.MXSetSubgraphPropertyOpNamesV2(c_str(subgraph_backend),
mx_uint(len(op_names)),
c_str_array(op_names)))
- part_sym = sym.optimize_for(subgraph_backend, exe1.arg_dict)
+ part_sym = sym.optimize_for(subgraph_backend, exe1.arg_dict, exe1.aux_dict)
check_call(_LIB.MXRemoveSubgraphPropertyOpNamesV2(c_str(subgraph_backend)))
exe2 = part_sym.simple_bind(ctx=mx.current_context(), grad_req='null')
@@ -335,7 +335,7 @@ def check_subgraph_exe8(sym, subgraph_backend, op_names):
# infer shape/type before partition before bind
check_call(_LIB.MXSetSubgraphPropertyOpNamesV2(c_str(subgraph_backend),
mx_uint(len(op_names)),
c_str_array(op_names)))
- part_sym = sym.optimize_for(subgraph_backend, arg_array)
+ part_sym = sym.optimize_for(subgraph_backend, arg_array, aux_array)
check_call(_LIB.MXRemoveSubgraphPropertyOpNamesV2(c_str(subgraph_backend)))
exe2 = part_sym.bind(ctx=mx.current_context(), args=arg_array,
aux_states=aux_array, grad_req='null')