tunalloc opened a new pull request, #21141:
URL: https://github.com/apache/incubator-mxnet/pull/21141

   ## Description ##
   Avoid a potential uninitialized read when using the C Predict API with 
symbols where the same input name occurs on more than one node.  Other parts of 
MXNet appear to treat this as an error, so this change makes the C API reject 
it with an error too.
   
   Since the C Predict API has been removed in MXNet v2 this PR is for merging 
into the 1.x branch.
   
   ## Checklist ##
   ### Essentials ###
   - [X] PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], 
[FEATURE], [DOC], etc)
   - [X] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage
   - [ ] Code is well-documented
   
   I could not find any existing tests for the C Predict API.  Please let me 
know if there's documentation that should be updated.
   
   ### Changes ###
   - [X] Reject symbols where a name occurs more than once in the C Predict API
   
   ## Comments ##
   It's possible to cause an uninitialized read in the C Predict API.  If I 
construct a symbol and parameters file using the Python API as follows:
   
   ```
   import mxnet
   net = mxnet.symbol.Variable('a') + mxnet.symbol.Variable('a')
   mxnet.nd.save('foo-params', {})
   net.save('foo-symbol.json')
   ```
   
   Then substitute the contents of the generated files into this C++ code.  
This causes an uninitialized read and fails the final assertion.
   
   ```
   #include <mxnet/c_predict_api.h>
   #include <string>
   #include <cassert>
   
   namespace {
   
   const char* SYMBOL_JSON = 
"{\"nodes\":[{\"op\":\"null\",\"name\":\"a\",\"inputs\":[]},{\"op\":\"null\",\"name\":\"a\",\"inputs\":[]},{\"op\":\"elemwise_add\",\"name\":\"_plus0\",\"inputs\":[[0,0,0],[1,0,0]]}],\"arg_nodes\":[0,1],\"node_row_ptr\":[0,1,2,3],\"heads\":[[2,0,0]],\"attrs\":{\"mxnet_version\":[\"int\",10901]}}";
 
   const char* PARAMS_BYTES = 
"\x12\x01\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0\x0";
   const std::size_t PARAMS_SIZE = 32;
   
   const char* INPUT_KEYS[] = { "a" };
   const std::size_t INPUT_KEYS_SIZE = 1;
   
   const uint32_t INPUT_SHAPE_INDPTR[] = { 0, 1 };
   const uint32_t INPUT_SHAPE_DATA[] = { 1 };
   
   }
   
   int main(int, char**) {
     PredictorHandle handle = nullptr;  
   
     int res = MXPredCreate(
       SYMBOL_JSON,
       static_cast<const void*>(PARAMS_BYTES),
       PARAMS_SIZE,
       1,
       0,
       INPUT_KEYS_SIZE,
       INPUT_KEYS,
       INPUT_SHAPE_INDPTR,
       INPUT_SHAPE_DATA,
       &handle);
     assert(res == 0);
   
     static const float INPUT_DATA[] = { 5. };
     res = MXPredSetInput(handle, "a", INPUT_DATA, 1);
     assert(res == 0);
   
     res = MXPredForward(handle);
     assert(res == 0);
   
     float output = 0.0;
     res = MXPredGetOutput(handle, 0, &output, 1);
     assert(res == 0);
   
     assert(output == 10.0);
     
     return 0;
   }
   ```
   
   If instead I generate the symbol as follows the test passes:
   
   ```
   import mxnet
   var = mxnet.symbol.Variable('a')
   net =  var + var
   mxnet.nd.save('foo-params', {})
   net.save('foo-symbol.json')
   ```
   
   I believe the API should reject symbols where an input name appears more 
than once.  This seems to happen in other parts of the code.  For example the 
following code fails with an assertion error:
   
   ```
   import mxnet
   net = mxnet.symbol.Variable('a') + mxnet.symbol.Variable('a')
   out = net.eval(ctx = mxnet.cpu(), a = mxnet.nd.array([[5.]]))
   ```
   
   This change makes the C Predict API reject such symbols.  With this change 
the above C++ code fails on the first assertion.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to