stereomatchingkiss opened a new issue #16983: C and cpp api of mxnet cannot 
load the yolov3 model of gluoncv
URL: https://github.com/apache/incubator-mxnet/issues/16983
 
 
   ## Description
   The C and C++ APIs of MXNet cannot load the YOLOv3 model of GluonCV; the same code works on Windows 10 64-bit Home Edition.
   
   ### Error Message
   Loading via the C++ API gives:
   
       Segmentation fault: 11
   
       Stack trace:
       [bt] (0) 
/home/yyyy/Qt/3rdLibs/mxnet/build/install/lib/libmxnet.so(+0x1399959) 
[0x7f7019bc2959]
       [bt] (1) /lib/x86_64-linux-gnu/libc.so.6(+0x3ef20) [0x7f7016e37f20]
       [bt] (2) [0x562f376fce90]
       terminate called after throwing an instance of 'dmlc::Error'
       what(): driver shutting down
       terminate called recursively
   
   Loading via the C API gives:
   
       Cannot initialize model, status is:-1
   
   No stack trace is printed :(
   
   ## To Reproduce
   
   1. Install CUDA 10.1 and cuDNN as [this post](https://medium.com/@exesse/cuda-10-1-installation-on-ubuntu-18-04-lts-d04f89287130) shows
   
   2. git clone --recursive https://github.com/apache/incubator-mxnet mxnet
   3. git checkout 5fb29167a1a66480864486bf59c6b4e980ce7daa
   4. sudo apt-get install libopenblas-dev
   5. Configure the build with cmake-gui
   
   
![img0](https://aws1.discourse-cdn.com/standard17/uploads/pyimagesearch/optimized/2X/a/a7edfa62e1e4825366160d3e4a1378c47fb49b98_2_690x388.jpeg)
   
![img1](https://aws1.discourse-cdn.com/standard17/uploads/pyimagesearch/original/2X/5/58fda94c2886937dc2dc2b44e1b00af136d898ca.jpeg)
   
![img2](https://aws1.discourse-cdn.com/standard17/uploads/pyimagesearch/original/2X/6/656bb555ecfbf9a639bf337a872428b4a329f559.jpeg)
   
   6. Generate the Makefile
   7. make -j4
   8. sudo make install
   9. Convert the GluonCV YOLOv3 model into a symbol/params pair loadable by the C/C++ API on CPU and GPU
   
   ```python
   import gluoncv as gcv
   from gluoncv.utils import export_block

   net = gcv.model_zoo.get_model('yolo3_darknet53_coco', pretrained=True)
   # export_block hybridizes the network and writes the symbol/params pair
   export_block('yolo3_darknet53_coco', net)
   ```
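   As a quick sanity check on this step, the exported symbol file can be inspected with nothing but the standard library, since it is plain graph JSON (`"nodes"` entries with `"op"` and `"name"` fields, where inputs have `"op": "null"`). A minimal sketch, using a tiny hand-written graph in place of the real yolo3_darknet53_coco.json:

   ```python
   import json

   def list_inputs(symbol_json: str):
       """Return the names of input nodes ("op": "null") in an MXNet graph JSON."""
       graph = json.loads(symbol_json)
       return [n["name"] for n in graph["nodes"] if n["op"] == "null"]

   # Tiny hand-written graph standing in for the exported yolo3_darknet53_coco.json
   toy = json.dumps({"nodes": [
       {"op": "null", "name": "data"},
       {"op": "null", "name": "conv0_weight"},
       {"op": "Convolution", "name": "conv0"},
   ]})
   print(list_inputs(toy))  # ['data', 'conv0_weight']
   ```

   If `data` does not appear among the inputs, the export step (rather than the C/C++ loader) would be the first suspect.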
   
   ### Steps to reproduce
   1. Load the converted YOLOv3 model via the C++ API on CPU and GPU
   
   ```cpp
   void load_check_point(std::string const &model_params,
                         std::string const &model_symbol,
                         Symbol *symbol,
                         std::map<std::string, NDArray> *arg_params,
                         std::map<std::string, NDArray> *aux_params,
                         Context const &ctx)
   {
       std::map<std::string, NDArray> const params = NDArray::LoadToMap(model_params);
       std::map<std::string, NDArray> args;
       std::map<std::string, NDArray> auxs;
       for (auto const &iter : params) {
           std::string const type = iter.first.substr(0, 4);
           std::string const name = iter.first.substr(4);
           if (type == "arg:")
               args[name] = iter.second.Copy(ctx);
           else if (type == "aux:")
               auxs[name] = iter.second.Copy(ctx);
           else
               std::cout << "wrong type" << std::endl;
       }

       *symbol = Symbol::Load(model_symbol);
       *arg_params = args;
       *aux_params = auxs;

       // WaitAll is needed when copying data between the GPU and main memory
       NDArray::WaitAll();
   }

   std::unique_ptr<Executor> create_executor(const std::string &model_params,
                                             const std::string &model_symbols,
                                             const Context &context,
                                             const Shape &input_shape)
   {
       Symbol net;
       std::map<std::string, NDArray> args, auxs;

       load_check_point(model_params, model_symbols, &net, &args, &auxs, context);
       // The shape of the input data must stay the same; if you need a different
       // size, rebind the Executor or create a pool of Executors.
       // A dummy NDArray creates the input layer of the Executor; the value of
       // "data" can be changed later.
       args["data"] = NDArray(input_shape, context, false);
       if (input_shape[0] > 1) {
           args["data1"] = NDArray(Shape(1), context, false);
       }
       args["softmax_label"] = NDArray(Shape(), context, false);

       return std::make_unique<Executor>(net.SimpleBind(context, args, {}, {}, auxs));
   }

   void load_yolo_object_detector_by_cpp_api()
   {
       std::string const model_folder("/home/yyyy/Qt/most_similar_images/similar_pose_php_wrapper/models");
       create_executor(model_folder + "/yolo3_darknet53_coco.params",
                       model_folder + "/yolo3_darknet53_coco.json",
                       mxnet::cpp::Context::gpu(0),
                       mxnet::cpp::Shape(1, 3, 224, 224)); // NCHW layout, as MXNet expects
   }
   ```
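   For reference, the name routing that `load_check_point` performs on the .params entries can be sketched in a few lines of Python (a mirror of the C++ logic above with illustrative names, not an MXNet API):

   ```python
   def split_params(param_names):
       """Route saved-NDArray names by their "arg:"/"aux:" prefix,
       as load_check_point does."""
       args, auxs = [], []
       for full in param_names:
           prefix, name = full[:4], full[4:]
           if prefix == "arg:":
               args.append(name)
           elif prefix == "aux:":
               auxs.append(name)
           else:
               print("wrong type:", full)
       return args, auxs

   # Illustrative entry names, not taken from the real yolo3 params file
   print(split_params(["arg:conv0_weight", "aux:batchnorm0_moving_mean"]))
   # (['conv0_weight'], ['batchnorm0_moving_mean'])
   ```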
   
   2. Load via the C API
   
   ```cpp
   const char *input_keys[] = {"data"};
   const mx_uint input_shape_indptr[] = {0, 4};
   const mx_uint input_shape_data[] = {1, 3, 224, 224};

   std::string const model_folder("/home/yyyy/Qt/most_similar_images/similar_pose_php_wrapper/models");
   // The symbol is the .json file, the parameters are the .params file
   auto const symbols = read_all_bytes(model_folder + "/yolo3_darknet53_coco.json");
   auto const strparams = read_all_bytes(model_folder + "/yolo3_darknet53_coco.params");

   // Create the predictor
   return MXPredCreate(symbols.c_str(),
                       strparams.c_str(),
                       static_cast<int>(strparams.size()),
                       2, // device type: 2 = GPU
                       0, // device id
                       static_cast<mx_uint>(1), // number of input nodes
                       input_keys,
                       input_shape_indptr,
                       input_shape_data,
                       &predictor_);
   ```
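   The three shape arrays passed to `MXPredCreate` use a CSR-style layout: `input_shape_indptr` marks where each input's shape begins and ends inside the flat `input_shape_data` array. A small Python sketch of how these arguments decode (illustrative only, not MXNet code):

   ```python
   def decode_shapes(input_keys, indptr, shape_data):
       """Slice the flat shape array back into one shape tuple per input key."""
       return {key: tuple(shape_data[indptr[i]:indptr[i + 1]])
               for i, key in enumerate(input_keys)}

   # The exact arguments used in the C snippet above
   print(decode_shapes(["data"], [0, 4], [1, 3, 224, 224]))
   # {'data': (1, 3, 224, 224)}
   ```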
   3. [Link to the full source code](https://drive.google.com/open?id=1yTYax6z2xyah4aamAcHkV9J6Gbv2PNAj)
   
   ## What have you tried to solve it?
   
   1. Tried loading another model (full-resnet-152); it worked with both the C++ and C APIs, on CPU and GPU
   
   ## Environment
   
   > ----------Python Info----------
   > ('Version      :', '2.7.15+')
   > ('Compiler     :', 'GCC 7.4.0')
   > ('Build        :', ('default', 'Oct  7 2019 17:39:04'))
   > ('Arch         :', ('64bit', ''))
   > ------------Pip Info-----------
   > No corresponding pip install for current python.
   > ----------MXNet Info-----------
   > No MXNet installed.
   > ----------System Info----------
   > ('Platform     :', 
'Linux-5.0.0-36-generic-x86_64-with-Ubuntu-18.04-bionic')
   > ('system       :', 'Linux')
   > ('node         :', 'yyyy-H110MH-PRO-D4')
   > ('release      :', '5.0.0-36-generic')
   > ('version      :', '#39~18.04.1-Ubuntu SMP Tue Nov 12 11:09:50 UTC 2019')
   > ----------Hardware Info----------
   > ('machine      :', 'x86_64')
   > ('processor    :', 'x86_64')
   > Architecture:        x86_64
   > CPU op-mode(s):      32-bit, 64-bit
   > Byte Order:          Little Endian
   > CPU(s):              4
   > On-line CPU(s) list: 0-3
   > Thread(s) per core:  1
   > Core(s) per socket:  4
   > Socket(s):           1
   > NUMA node(s):        1
   > Vendor ID:           GenuineIntel
   > CPU family:          6
   > Model:               94
   > Model name:          Intel(R) Core(TM) i5-6400 CPU @ 2.70GHz
   > Stepping:            3
   > CPU MHz:             2700.050
   > CPU max MHz:         3300.0000
   > CPU min MHz:         800.0000
   > BogoMIPS:            5424.00
   > Virtualization:      VT-x
   > L1d cache:           32K
   > L1i cache:           32K
   > L2 cache:            256K
   > L3 cache:            6144K
   > NUMA node0 CPU(s):   0-3
   > Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge 
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx 
pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl 
xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 
monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 
x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 
3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp 
tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep 
bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec 
xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp 
md_clear flush_l1d
   > ----------Network Test----------
   > Setting timeout: 10
   > Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0124 sec, LOAD: 
4.3107 sec.
   > Timing for D2L: http://d2l.ai, DNS: 0.2340 sec, LOAD: 0.5400 sec.
   > Timing for FashionMNIST: 
https://repo.mxnet.io/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, 
DNS: 0.0483 sec, LOAD: 0.8083 sec.
   > Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0295 sec, 
LOAD: 0.3983 sec.
   > Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0064 
sec, LOAD: 0.7125 sec.
   > Timing for GluonNLP: http://gluon-nlp.mxnet.io, DNS: 1.0813 sec, LOAD: 
0.3593 sec.
   > Timing for D2L (zh-cn): http://zh.d2l.ai, DNS: 0.0853 sec, LOAD: 0.0673 
sec.
   > Timing for GluonNLP GitHub: https://github.com/dmlc/gluon-nlp, DNS: 0.0004 
sec, LOAD: 1.7169 sec.
   > 
   > 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
