[GitHub] [incubator-mxnet] xianyujie commented on issue #15108: The test time of the model on GPU is normal, but the test time on CPU is very long.
URL: https://github.com/apache/incubator-mxnet/issues/15108#issuecomment-500288680

After I set the minimum values to 0, the test results of the two models are the same.

**Same image as input; get the outputs (pre_output1, pre_output2) from the stage1_unit1_relu1 layer of the two models. Then time the Conv layer, with pre_output1 as the input of my model and pre_output2 as the input of the original model:**
(0.004521, 0.004521)
**pre_output1 as the input of both models; time the Conv layer:**
(0.004198, 0.004238)
**pre_output2 as the input of both models; time the Conv layer:**
(0.004196, 0.004184)

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
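Flushing tiny values to zero like this is the standard workaround for subnormal (denormal) floats, which plausibly explains the symptom in this issue: CPU FPUs handle subnormals through slow microcode paths, while GPU kernels are often configured to flush them. A minimal sketch of such a flush, using plain NumPy arrays as a stand-in for the models' arg_params (the parameter name below is illustrative, not taken from the author's param file):

```python
import numpy as np

def flush_subnormals(params, eps=None):
    """Return a copy of each array with subnormal (denormal) values zeroed.

    Values with magnitude below the smallest normal float are flushed,
    since they can make CPU convolution dramatically slower.
    """
    flushed = {}
    for name, arr in params.items():
        threshold = np.finfo(arr.dtype).tiny if eps is None else eps
        out = arr.copy()
        out[np.abs(out) < threshold] = 0.0  # flush values below smallest normal
        flushed[name] = out
    return flushed

# Hypothetical parameter dict containing two subnormal float32 values
params = {"stage1_unit1_conv2_weight": np.array([1e-40, 0.5, -1e-42, 2.0],
                                                dtype=np.float32)}
clean = flush_subnormals(params)
```

After the flush, the two subnormal entries become exactly 0 while normal-range weights are untouched, which matches the "set the minimum values to 0" fix described above.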
[GitHub] [incubator-mxnet] xianyujie commented on issue #15108: The test time of the model on GPU is normal, but the test time on CPU is very long.
URL: https://github.com/apache/incubator-mxnet/issues/15108#issuecomment-500273584

Many thanks, but the input data is obtained from real data passed through many layers, and the results for all the real pictures are the same as above. So perhaps the minimal calculation results could simply be set equal to 0?
[GitHub] [incubator-mxnet] xianyujie commented on issue #15108: The test time of the model on GPU is normal, but the test time on CPU is very long.
URL: https://github.com/apache/incubator-mxnet/issues/15108#issuecomment-499014301

I've saved pre_output1, pre_output2, conv1_weight, and conv2_weight into the param file, and reloaded them for testing.
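A round-trip like this (dump the captured activations and weights, reload them for an isolated timing test) can be sketched as follows. This is an illustration, not the author's script: it uses NumPy's `.npz` format in place of MXNet's `mx.nd.save` / `mx.nd.load`, and the tensor shapes are assumed:

```python
import numpy as np

# Hypothetical captured tensors, shaped like stage1_unit1 activations/weights
tensors = {
    "pre_output1": np.random.rand(1, 64, 56, 56).astype(np.float32),
    "pre_output2": np.random.rand(1, 64, 56, 56).astype(np.float32),
    "conv1_weight": np.random.rand(64, 64, 3, 3).astype(np.float32),
    "conv2_weight": np.random.rand(64, 64, 3, 3).astype(np.float32),
}

np.savez("test_tensors.npz", **tensors)      # save to one param-style file
loaded = dict(np.load("test_tensors.npz"))   # reload for the timing test
```

Saving the exact tensors guarantees both machines benchmark bit-identical inputs and weights, which is what makes the comparisons in this thread reproducible.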
[GitHub] [incubator-mxnet] xianyujie commented on issue #15108: The test time of the model on GPU is normal, but the test time on CPU is very long.
URL: https://github.com/apache/incubator-mxnet/issues/15108#issuecomment-499012789

@pengzhao-intel Yeah, I've tested it many times. Here is my test file link: https://drive.google.com/open?id=1pzWZFY2cXsqYphit1HpMdGjntfSgIb9w
[GitHub] [incubator-mxnet] xianyujie commented on issue #15108: The test time of the model on GPU is normal, but the test time on CPU is very long.
URL: https://github.com/apache/incubator-mxnet/issues/15108#issuecomment-498944552

@pengzhao-intel I think you misunderstood the result. Take a look at the following results: different inputs have a large influence on the running time of the convolution layer. What could be the reason for this?

**Same image as input; get the outputs (pre_output1, pre_output2) from the stage1_unit1_relu1 layer of the two models. Then time the Conv layer, with pre_output1 as the input of my model and pre_output2 as the input of the original model:**
(my_model: 0.075896, original model: 0.006333)
**pre_output1 as the input of both models; time the Conv layer:**
(my_model: 0.072311, original model: 0.072548)
**pre_output2 as the input of both models; time the Conv layer:**
(my_model: 0.0055, original model: 0.005653)
[GitHub] [incubator-mxnet] xianyujie commented on issue #15108: The test time of the model on GPU is normal, but the test time on CPU is very long.
URL: https://github.com/apache/incubator-mxnet/issues/15108#issuecomment-498564527

@pengzhao-intel Can you find out what the problem is? **(The stage_unit1_relu1 layer comes directly before the stage_unit1_conv1 layer.)**

**=== The shape of the input image is (1, 3, 112, 112); warm up 20 times. Same image as input; time from the input layer to the stage_unit1_relu1 layer:**
(0.030597, 0.029112) (0.029765, 0.027273) (0.029243, 0.026952)
**Same image as input; time from the input layer to the stage_unit1_conv1 layer:**
(0.105118, 0.031303) (0.098103, 0.031876) (0.098488, 0.031119)
**Same image as input; get the outputs (pre_output1, pre_output2) from the stage1_unit1_relu1 layer of the two models. Then time the Conv layer, with pre_output1 as the input of my model and pre_output2 as the input of the original model:**
(0.075896, 0.006333) (0.07515, 0.005862) (0.075065, 0.005814)
**pre_output1 as the input of both models; time the Conv layer:**
(0.072311, 0.072548) (0.073366, 0.075132) (0.074931, 0.074962)
**pre_output2 as the input of both models; time the Conv layer:**
(0.0055, 0.005653) (0.005488, 0.005642) (0.005499, 0.005644)

('loading bin', 0) (2, 3, 12, 12)

**=== The shape of the input image is (1, 3, 12, 12); warm up 20 times. Same image as input; time from the input layer to the stage_unit1_relu1 layer:**
(0.000733, 0.000735) (0.000723, 0.000767) (0.000761, 0.00073)
**Same image as input; time from the input layer to the stage_unit1_conv1 layer:**
(0.003494, 0.000923) (0.003302, 0.000857) (0.003363, 0.000906)
**Same image as input; get the outputs (pre_output1, pre_output2) from the stage1_unit1_relu1 layer of the two models. Then time the Conv layer, with pre_output1 as the input of my model and pre_output2 as the input of the original model:**
(0.002808, 0.000264) (0.002637, 0.000262) (0.002527, 0.000257)
**pre_output1 as the input of both models; time the Conv layer:**
(0.002551, 0.00255) (0.002455, 0.002451) (0.002425, 0.002443)
**pre_output2 as the input of both models; time the Conv layer:**
(0.000261, 0.000258) (0.000259, 0.000255) (0.00026, 0.000255)
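The experiment above (warm up, then time one layer repeatedly with each candidate input) can be sketched as a small harness. This is an illustration of the methodology, not the issue author's script: a plain NumPy matmul stands in for the MXNet Conv forward pass, and all names and shapes are assumed:

```python
import time
import numpy as np

def time_op(op, x, n_warmup=20, n_runs=3):
    """Warm up, then return per-run wall-clock times, as in the tables above."""
    for _ in range(n_warmup):
        op(x)
    times = []
    for _ in range(n_runs):
        t0 = time.time()
        op(x)
        times.append(time.time() - t0)
    return times

# Stand-in for the Conv forward pass (hypothetical weight shape)
w = np.random.rand(256, 256).astype(np.float32)
conv_like = lambda x: x @ w

# Two candidate inputs, playing the role of pre_output1 / pre_output2
pre_output1 = np.random.rand(256, 256).astype(np.float32)
pre_output2 = np.random.rand(256, 256).astype(np.float32)

t1 = time_op(conv_like, pre_output1)
t2 = time_op(conv_like, pre_output2)
```

Swapping the same two inputs between the two models, as done above, cleanly separates "the input is slow" from "the model is slow".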
[GitHub] [incubator-mxnet] xianyujie commented on issue #15108: The test time of the model on GPU is normal, but the test time on CPU is very long.
URL: https://github.com/apache/incubator-mxnet/issues/15108#issuecomment-498128806

Here I tested the run time of different network layers on one image, and found that the problem occurs in most of the conv2_weight layers. Tested on CPU with a non-MKL build; the first column shows my model, the second column shows the original model, and the third column shows the network layer name. [all_output_time_contrast.txt](https://github.com/apache/incubator-mxnet/files/3246126/all_output_time_contrast.txt)
[GitHub] [incubator-mxnet] xianyujie commented on issue #15108: The test time of the model on GPU is normal, but the test time on CPU is very long.
URL: https://github.com/apache/incubator-mxnet/issues/15108#issuecomment-498126043

@pengzhao-intel The MKL build brings a great performance improvement to both models, but the issue still exists. By timing the different network layers, I found that the problem occurs in most of the conv2_weight layers. The symbol.json of both models is the same; only the values in arg_params differ. Why does that make such a huge difference in the models' running time?
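Since only the arg_params values differ, one way to check whether the weight values themselves are the culprit is to scan each array for subnormal (denormal) floats, which are known to slow down CPU math kernels. A sketch of such a scan, using NumPy arrays as a stand-in for the loaded arg_params (the weight names and the injected subnormal values below are hypothetical):

```python
import numpy as np

def count_subnormals(arr):
    """Count nonzero values smaller in magnitude than the smallest normal float."""
    tiny = np.finfo(arr.dtype).tiny  # smallest positive normal value
    return int(np.sum((arr != 0) & (np.abs(arr) < tiny)))

# Hypothetical arg_params: one clean weight, one filled with subnormal values
arg_params = {
    "stage1_unit1_conv1_weight": np.random.rand(64, 64, 3, 3).astype(np.float32),
    "stage1_unit1_conv2_weight": np.full((64, 64, 3, 3), 1e-40, dtype=np.float32),
}

report = {name: count_subnormals(arr) for name, arr in arg_params.items()}
```

If the conv2_weight arrays report large subnormal counts while the rest report zero, that would match the per-layer timing pattern seen above.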
[GitHub] [incubator-mxnet] xianyujie commented on issue #15108: The test time of the model on GPU is normal, but the test time on CPU is very long.
URL: https://github.com/apache/incubator-mxnet/issues/15108#issuecomment-497634604

Here I tested the computation time for each output layer, as follows. Test on CPU:

> **output_name: my_mod, r100_mod**
> id_output: 0.00042, 0.0005
> _minusscalar0_output: 0.000355, 0.000364
> _mulscalar0_output: 0.000345, 0.000368
> conv0_output: 0.004286, 0.005467
> bn0_output: 0.004573, 0.004223
> relu0_output: 0.005896, 0.011879
> stage1_unit1_bn1_output: 0.009214, 0.012385
> stage1_unit1_conv1_output: 0.030702, 0.033954
> stage1_unit1_bn2_output: 0.029785, 0.030585
> stage1_unit1_relu1_output: 0.031125, 0.027422
> stage1_unit1_conv2_output: 0.103826, 0.030677
> stage1_unit1_bn3_output: 0.10424, 0.030824
> stage1_unit1_conv1sc_output: 0.007101, 0.017014
> stage1_unit1_sc_output: 0.006566, 0.01673
> _plus0_output: 0.107347, 0.040394
> stage1_unit2_bn1_output: 0.107868, 0.041233
> stage1_unit2_conv1_output: 0.118626, 0.065909
> stage1_unit2_bn2_output: 0.111833, 0.045059
> stage1_unit2_relu1_output: 0.137247, 0.068904
> stage1_unit2_conv2_output: 0.191567, 0.064852
> stage1_unit2_bn3_output: 0.173118, 0.060588
> _plus1_output: 0.156072, 0.04732
> stage1_unit3_bn1_output: 0.155305, 0.047213
> stage1_unit3_conv1_output: 0.159347, 0.049804
> stage1_unit3_bn2_output: 0.164846, 0.066085
> stage1_unit3_relu1_output: 0.1657, 0.071379
> stage1_unit3_conv2_output: 0.325648, 0.052968
> stage1_unit3_bn3_output: 0.37007, 0.05749
> _plus2_output: 0.33891, 0.071809
> stage2_unit1_bn1_output: 0.328169, 0.066442
> stage2_unit1_conv1_output: 0.322733, 0.056022
> stage2_unit1_bn2_output: 0.325429, 0.066875
> stage2_unit1_relu1_output: 0.326984, 0.075964
> stage2_unit1_conv2_output: 0.490685, 0.06886
> stage2_unit1_bn3_output: 0.494744, 0.070616
> stage2_unit1_conv1sc_output: 0.313479, 0.050747
> stage2_unit1_sc_output: 0.313801, 0.051451
> _plus3_output: 0.488039, 0.057655
> stage2_unit2_bn1_output: 0.560673, 0.078912
> stage2_unit2_conv1_output: 0.551832, 0.080624
> stage2_unit2_bn2_output: 0.523579, 0.069424
> stage2_unit2_relu1_output: 0.498217, 0.07407
> stage2_unit2_conv2_output: 0.584102, 0.074921
> stage2_unit2_bn3_output: 0.58085, 0.074009
> _plus4_output: 0.581565, 0.072269
> stage2_unit3_bn1_output: 0.587932, 0.079426
> stage2_unit3_conv1_output: 0.584251, 0.06408
> stage2_unit3_bn2_output: 0.592011, 0.085435
> stage2_unit3_relu1_output: 0.58509, 0.07392
> stage2_unit3_conv2_output: 0.624394, 0.081102
> stage2_unit3_bn3_output: 0.627502, 0.078248
> _plus5_output: 0.624716, 0.082667
> stage2_unit4_bn1_output: 0.62572, 0.079881
> stage2_unit4_conv1_output: 0.671533, 0.104247
> stage2_unit4_bn2_output: 0.664362, 0.090178
> stage2_unit4_relu1_output: 0.668143, 0.093252
> stage2_unit4_conv2_output: 0.831109, 0.093747
> stage2_unit4_bn3_output: 0.794402, 0.091997
> _plus6_output: 0.796238, 0.084438
> stage2_unit5_bn1_output: 0.803676, 0.087099
> stage2_unit5_conv1_output: 0.798894, 0.088101
> stage2_unit5_bn2_output: 0.801456, 0.094359
> stage2_unit5_relu1_output: 0.799399, 0.085508
> stage2_unit5_conv2_output: 0.972368, 0.099089
> stage2_unit5_bn3_output: 0.973519, 0.091875
> _plus7_output: 0.974544, 0.100297
> stage2_unit6_bn1_output: 0.974595, 0.094085
> stage2_unit6_conv1_output: 0.975609, 0.104353
> stage2_unit6_bn2_output: 0.973079, 0.09231
> stage2_unit6_relu1_output: 0.978731, 0.094409
> stage2_unit6_conv2_output: 1.151426, 0.095977
> stage2_unit6_bn3_output: 1.154868, 0.100843
> _plus8_output: 1.152926, 0.106044
> stage2_unit7_bn1_output: 1.154156, 0.10229
> stage2_unit7_conv1_output: 1.264803, 0.104436
> stage2_unit7_bn2_output: 1.152894, 0.09908
> stage2_unit7_relu1_output: 1.156383, 0.102306
> stage2_unit7_conv2_output: 1.327798, 0.102294
> stage2_unit7_bn3_output: 1.329875, 0.099894
> _plus9_output: 1.334517, 0.109175
> stage2_unit8_bn1_output: 1.331067, 0.115449
> stage2_unit8_conv1_output: 1.337357, 0.11694
> stage2_unit8_bn2_output: 1.33041, 0.107694