avivna opened a new issue #15557: mxnet.profiler.dump creates invalid json files
URL: https://github.com/apache/incubator-mxnet/issues/15557

## Description
I would like to use the MXNet profiler to profile a deep learning training job. However, when mxnet.profiler.dumps is called more than once during the training job (each time dumping the output into a different file), the output files are not valid JSON.

For example, assuming that 3 files are created during the run using the profiler.dumps command:

1. The first file is missing the following suffix:

       ], "displayTimeUnit": "ms" }

2. The last file is missing the following prefix:

       { "traceEvents": [
       { "ph": "M", "args": { "name": "cpu/0" }, "pid": 0, "name": "process_name" },
       { "ph": "M", "args": { "name": "cpu/1" }, "pid": 1, "name": "process_name" },
       { "ph": "M", "args": { "name": "cpu pinned/" }, "pid": 2, "name": "process_name" },
       { "ph": "M", "args": { "name": "cpu shared/" }, "pid": 3, "name": "process_name" },

3. The second file is missing both the suffix and the prefix described in [1] and [2].

## Environment info (Required)
I downloaded the following Docker image from the mxnet/python repository: mxnet/python:1.4.1_cpu_py2

Results of diagnose.py script:
('Version :', '2.7.12')
('Compiler :', 'GCC 5.4.0 20160609')
('Build :', ('default', 'Nov 12 2018 14:36:49'))
('Arch :', ('64bit', 'ELF'))
------------Pip Info-----------
('Version :', '19.1.1')
('Directory :', '/usr/local/lib/python2.7/dist-packages/pip')
----------MXNet Info-----------
('Version :', '1.4.1')
('Directory :', '/usr/local/lib/python2.7/dist-packages/mxnet')
Hashtag not found. Not installed from pre-built package.
----------System Info----------
('Platform :', 'Linux-4.9.125-linuxkit-x86_64-with-Ubuntu-16.04-xenial')
('system :', 'Linux')
('node :', '9178f021bc9a')
('release :', '4.9.125-linuxkit')
('version :', '#1 SMP Fri Sep 7 08:20:28 UTC 2018')
----------Hardware Info----------
('machine :', 'x86_64')
('processor :', 'x86_64')
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 142
Model name: Intel(R) Core(TM) i7-7660U CPU @ 2.50GHz
Stepping: 9
CPU MHz: 2500.000
BogoMIPS: 4933.47
Hypervisor vendor: vertical
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 4096K
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht pbe syscall nx pdpe1gb lm constant_tsc rep_good nopl xtopology nonstop_tsc pni pclmulqdq dtes64 ds_cpl ssse3 sdbg fma cx16 xtpr pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch kaiser fsgsbase bmi1 hle avx2 bmi2 erms rtm xsaveopt arat
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0900 sec, LOAD: 0.8678 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0894 sec, LOAD: 1.7616 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.1251 sec, LOAD: 1.4927 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0941 sec, LOAD: 0.3832 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.0892 sec, LOAD: 1.3198 sec.
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.1272 sec, LOAD: 1.1022 sec.
Package used (Python/R/Scala/Julia): I'm using the Python package

## Build info (Required if built from source)
Compiler (gcc/clang/mingw/visual studio): GCC 5.4.0 20160609
MXNet version: ('Version :', '1.4.1') ('Directory :', '/usr/local/lib/python2.7/dist-packages/mxnet')
Build config: (Paste the content of config.mk, or the build command.)

## Minimum reproducible example
Download the following script and follow the steps in the next section to reproduce this bug:
[mxnet_profiler_dumps_invalid_json_example.py.zip](https://github.com/apache/incubator-mxnet/files/3396593/mxnet_profiler_dumps_invalid_json_example.py.zip)

## Steps to reproduce
1. Pull the Docker image mxnet/python:1.4.1_cpu_py2 from the repository.
2. Start a container with the following command:
   docker run -e PYTHONUNBUFFERED=0 -it mxnet/python:1.4.1_cpu_py2 /bin/bash
3. Download the attached archive 'mxnet_profiler_dumps_invalid_json_example.py.zip' and unzip it.
4. Open a new shell and copy the script from the previous step into the Docker container:
   docker cp {download_folder}/mxnet_profiler_dumps_invalid_json_example.py.py {CONTAINER_ID}:/mxnet_profiler_dumps_invalid_json_example.py
5. Go back to the running container and run the script:
   python mxnet_profiler_dumps_invalid_json_example.py
6. Install vim in the container:
   apt-get update
   apt-get install vim
7. Use vim to inspect the missing prefix/suffix in the following files:
   /tmp/profiler_dump/profile_iteration_0.json
   /tmp/profiler_dump/profile_iteration_1.json
   /tmp/profiler_dump/profile_iteration_2.json

## What have you tried to solve it?
1. When only a single file is dumped, the profiler produces a valid JSON file.
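
As an alternative to inspecting the files in vim, the check in step 7 can be scripted. The following is a minimal sketch and not part of the attached script: the helper name `diagnose_dump` and the two regular expressions are hypothetical, and they assume the prefix and suffix look exactly like the fragments quoted in the description, up to whitespace. It uses only the Python standard library and does not call MXNet.

```python
import json
import os
import re

# Envelope fragments as quoted in the description (assumed; whitespace may vary).
PREFIX_RE = re.compile(r'^\s*\{\s*"traceEvents"\s*:\s*\[')
SUFFIX_RE = re.compile(r'\]\s*,\s*"displayTimeUnit"\s*:\s*"ms"\s*\}\s*$')


def diagnose_dump(text):
    """Report whether a dump parses as JSON and which envelope parts it has."""
    try:
        json.loads(text)
        valid = True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        valid = False
    return {
        "valid_json": valid,
        "has_prefix": PREFIX_RE.search(text) is not None,
        "has_suffix": SUFFIX_RE.search(text) is not None,
    }


if __name__ == "__main__":
    # File names taken from step 7 of the reproduction steps.
    dump_dir = "/tmp/profiler_dump"
    for i in range(3):
        path = os.path.join(dump_dir, "profile_iteration_%d.json" % i)
        if not os.path.exists(path):
            print("%s: not found" % path)
            continue
        with open(path) as f:
            print("%s: %s" % (path, diagnose_dump(f.read())))
```

If the behaviour matches the description, the first file should report a prefix but no suffix, the last file a suffix but no prefix, and the second file neither; none of the three should parse as JSON.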
