AssassinTee opened a new issue #15891: Timeout on second predict
URL: https://github.com/apache/incubator-mxnet/issues/15891
 
 
   ## Description
   I'm setting up a Flask server which loads my MXNet model and exposes a predict API method.
   
   While testing the API I noticed that the prediction hangs on the second call into the MXNet API. By "timeout" I mean that Python gets stuck in one MXNet method and seems to run endlessly.
   
   I am using python-flask and MXNet v1.4.1. I tried upgrading to MXNet v1.5.0, but nothing changed and the error persists.
   
   I already tried different implementations of the predict method (see below), but both hang. When I switch to a Keras backend everything works fine, but I need to use MXNet.
   
   I used this guide for the forward prediction: [https://mxnet.incubator.apache.org/versions/master/tutorials/python/predict_image.html](https://mxnet.incubator.apache.org/versions/master/tutorials/python/predict_image.html)
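   
For context, the Flask wiring looks roughly like the sketch below. This is a hypothetical reconstruction, not the actual server code: `create_app`, the `/predict` route, and the JSON payload shape are all illustrative assumptions; only the `predict(ndarray) -> ndarray` backend contract comes from the reproducible example further down.

```python
# Hypothetical minimal Flask wiring around a backend object that exposes
# predict(ndarray) -> ndarray. Route name and payload shape are assumptions.
from flask import Flask, jsonify, request
import numpy as np

def create_app(backend):
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        # Convert the posted JSON payload into an ndarray for the backend.
        X = np.array(request.get_json()["data"], dtype=np.float32)
        # With the MXNet backend, this call hangs from the second request on.
        return jsonify(result=backend.predict(X).tolist())

    return app
```

Swapping in a Keras-based backend object with the same `predict` signature makes the server behave normally, which is why the issue points at the MXNet side.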
   
   ## Environment info (Required)
   
   ```
   ----------Python Info----------
   Version      : 3.6.9
   Compiler     : GCC 7.3.0
   Build        : ('default', 'Jul 30 2019 19:07:31')
   Arch         : ('64bit', '')
   ------------Pip Info-----------
   Version      : 19.1.1
   Directory    : 
/home/<removed>/anaconda3/envs/openapi_flask/lib/python3.6/site-packages/pip
   ----------MXNet Info-----------
   Version      : 1.5.0
   Directory    : 
/home/<removed>/anaconda3/envs/openapi_flask/lib/python3.6/site-packages/mxnet
   Commit Hash   : 75a9e187d00a8b7ebc71412a02ed0e3ae489d91f
   Library      : 
['/home/<removed>/anaconda3/envs/openapi_flask/lib/python3.6/site-packages/mxnet/libmxnet.so']
   Build features:
   ✖ CUDA
   ✖ CUDNN
   ✖ NCCL
   ✖ CUDA_RTC
   ✖ TENSORRT
   ✔ CPU_SSE
   ✔ CPU_SSE2
   ✔ CPU_SSE3
   ✔ CPU_SSE4_1
   ✔ CPU_SSE4_2
   ✖ CPU_SSE4A
   ✔ CPU_AVX
   ✖ CPU_AVX2
   ✖ OPENMP
   ✖ SSE
   ✔ F16C
   ✖ JEMALLOC
   ✖ BLAS_OPEN
   ✖ BLAS_ATLAS
   ✖ BLAS_MKL
   ✖ BLAS_APPLE
   ✔ LAPACK
   ✖ MKLDNN
   ✔ OPENCV
   ✖ CAFFE
   ✖ PROFILER
   ✔ DIST_KVSTORE
   ✖ CXX14
   ✖ INT64_TENSOR_SIZE
   ✔ SIGNAL_HANDLER
   ✖ DEBUG
   ----------System Info----------
   Platform     : Linux-4.15.0-52-generic-x86_64-with-debian-buster-sid
   system       : Linux
   node         : marvin-Latitude-5590
   release      : 4.15.0-52-generic
   version      : #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019
   ----------Hardware Info----------
   machine      : x86_64
   processor    : x86_64
   Architecture:                  x86_64
   CPU op-mode(s):                32-bit, 64-bit
   Byte Order:                    Little Endian
   CPU(s):                        8
   On-line CPU(s) list:           0-7
   Thread(s) per core:            2
   Core(s) per socket:            4
   Socket(s):                     1
   NUMA node(s):                  1
   Vendor ID:                     GenuineIntel
   CPU family:                    6
   Model:                         142
   Model name:                    Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz
   Stepping:                      10
   CPU MHz:                       1197.995
   CPU max MHz:                   3600.0000
   CPU min MHz:                   400.0000
   BogoMIPS:                      3792.00
   Virtualization:                VT-x
   L1d cache:                     32K
   L1i cache:                     32K
   L2 cache:                      256K
   L3 cache:                      6144K
   NUMA node0 CPU(s):             0-7
   Flags:                         fpu vme de pse tsc msr pae mce cx8 apic sep 
mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe 
syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good 
nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 
monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 
x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 
3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp 
tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep 
bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec 
xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp 
md_clear flush_l1d
   ----------Network Test----------
   Setting timeout: 10
   Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0061 
sec, LOAD: 0.7382 sec.
   Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.0363 sec, LOAD: 
0.8136 sec.
   Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.0551 sec, LOAD: 
0.8109 sec.
   Timing for FashionMNIST: 
https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz,
 DNS: 0.0602 sec, LOAD: 0.7776 sec.
   Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0186 sec, LOAD: 
0.5304 sec.
   Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0203 sec, 
LOAD: 0.1319 sec.
   ```
   
   Package used (Python/R/Scala/Julia):
   I'm using the pip version (tried both 1.4.1 and 1.5.0); neither works.
   
   ## Build info (Required if built from source)
   
   Python 3.6 with Flask; MXNet installed via pip, not built from source.
   
   ## Error Message:
   None; the program simply hangs without any output.
   
   ## Minimum reproducible example
   ```python
   import mxnet as mx
   from collections import namedtuple
   
   Batch = namedtuple('Batch', ['data'])
   
   class MxnetBackend:
       def __init__(self):
           print("MXNet Version:", mx.__version__)
           self.sym, self.arg_params, self.aux_params = mx.model.load_checkpoint(
               prefix='models/kc_mxnet', epoch=0)
           self.mod = mx.mod.Module(symbol=self.sym,
                                    data_names=['/dense_1_input1'],
                                    context=mx.cpu(),
                                    label_names=None)
           self.mod.bind(for_training=False,
                         data_shapes=[('/dense_1_input1', (1, 1, 512, 3010))],
                         label_shapes=self.mod._label_shapes)
           self.mod.set_params(self.arg_params, self.aux_params, allow_missing=True)
   
       def predict(self, X):  # hangs on the second call
           """Takes an ndarray, returns an ndarray."""
           X = mx.nd.array(X)
           self.mod.forward(Batch([X]))  # forward expects a list of NDArrays
           return self.mod.get_outputs()[0].asnumpy()
   
       def predict2(self, X):  # hangs on the second call, too
           """Takes an ndarray, returns an ndarray."""
           return self.mod.predict(X).asnumpy()
   ```
   
   ## Steps to reproduce
   
   1. Use the backend to predict a label
   2. Predict another label
   3. Observe that the second call never returns
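
A hypothetical way to demonstrate step 3 without waiting forever is to run each predict call under a watchdog timeout. The `StubBackend` below stands in for `MxnetBackend` (which needs the model files from the example above), so this sketch shows only the harness; with the real backend, the second call would come back with `finished=False`.

```python
import threading
import numpy as np

def call_with_timeout(fn, arg, seconds=30.0):
    """Run fn(arg) in a daemon thread; return (finished, result)."""
    box = {}

    def worker():
        box["out"] = fn(arg)

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(seconds)
    # If the thread is still alive after the join, the call hung.
    return (not t.is_alive(), box.get("out"))

# Stand-in for MxnetBackend so the harness itself is runnable.
class StubBackend:
    def predict(self, X):
        return X + 1

backend = StubBackend()
for i in (1, 2):
    finished, out = call_with_timeout(backend.predict, np.zeros((1, 4)), seconds=5)
    print("call", i, "finished:", finished)
```

A daemon thread is used so a hung worker does not keep the process alive after the watchdog gives up.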
   
   ## What have you tried to solve it?
   
   1. Used multiple implementations of the predictor (see methods predict and predict2)
   2. Updated MXNet to the latest version (from 1.4.1 to 1.5.0)
   3. Asked on Stack Overflow
   
