[jira] [Commented] (SYSTEMML-762) Fix the bug that causes local MR-Jobs when running in non-singlenode mode on MNIST data for Lenet script
[ https://issues.apache.org/jira/browse/SYSTEMML-762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15348913#comment-15348913 ]

Niketan Pansare commented on SYSTEMML-762:
------------------------------------------

Yes, this is expected, as no metadata is available for CSV. Please note that no MR jobs are created for im2col or ba+*.

Also, of the 41.459 seconds spent in im2col, 33.114 seconds are spent in cache release (which most likely occurs during validation and/or testing). I am working on optimized conv2d* operators that will help avoid this cost, and the improvement should be in soon :)

As a related side note, we should provide additional converter utils (if necessary) and encourage users to test their deep learning scripts using binary format for more accurate profiling.

> Fix the bug that causes local MR-Jobs when running in non-singlenode mode on MNIST data for Lenet script
> --------------------------------------------------------------------------------------------------------
>
> Key: SYSTEMML-762
> URL: https://issues.apache.org/jira/browse/SYSTEMML-762
> Project: SystemML
> Issue Type: Bug
> Reporter: Niketan Pansare
> Assignee: Niketan Pansare
> Attachments: log.txt, log2.txt, log3.txt, log4.txt, log5.txt, log6.txt, log7.txt

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (SYSTEMML-762) Fix the bug that causes local MR-Jobs when running in non-singlenode mode on MNIST data for Lenet script
[ https://issues.apache.org/jira/browse/SYSTEMML-762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15348898#comment-15348898 ]

Mike Dusenberry commented on SYSTEMML-762:
------------------------------------------

Looks like MR jobs are being created now for the CSV reblocking.
[jira] [Commented] (SYSTEMML-762) Fix the bug that causes local MR-Jobs when running in non-singlenode mode on MNIST data for Lenet script
[ https://issues.apache.org/jira/browse/SYSTEMML-762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345582#comment-15345582 ]

Niketan Pansare commented on SYSTEMML-762:
------------------------------------------

[~mwdus...@us.ibm.com] Local MR jobs were created for matrix multiplication (mapmm) because n could not be computed until nrow was executed. (Note: conv2d = reshape_col(filter %*% im2col(image)).)

{code:none}
end = beg + BATCH_SIZE - 1
if(end > num_images) end = num_images

#pulling out the batch
Xb = X[beg:end,]
n = nrow(Xb)
H1_activations = conv2d(Xb, conv_layer1_wts, padding=[2,2], stride=[1,1], input_shape=[n,1,img_height,img_width], filter_shape=[n1,1,kernel_height,kernel_width])
{code}

If you change the code to the following, you will no longer see local MR jobs:

{code:none}
end = beg + BATCH_SIZE - 1
if(end > num_images) {
  beg = 1
  end = beg + BATCH_SIZE - 1
  #end = num_images
}

#pulling out the batch
Xb = X[beg:end,]
# Note: you will be missing the last few records
n = BATCH_SIZE
{code}

I have added debug information to identify situations like this in the commit:
https://github.com/apache/incubator-systemml/commit/873229f30527c8bfe6dc9399f53fd9f6dbb5b10e

Since we recompile at the loop level, I am not sure we can fix the former case.
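The difference between the two batching variants in the comment above can be sketched outside of DML. The following is a minimal Python illustration, not SystemML code: the helper names `batch_bounds` and `batch_sizes` and the sizes used in the usage note (64 and 200) are made up for this example, and it assumes the batch size does not exceed the number of images. It only shows why `n` varies in the first variant (the last batch is clamped) and stays constant in the second (the last batch wraps to the first rows).

```python
def batch_bounds(beg, batch_size, num_images, fixed_size):
    """Return (beg, end, n) for a 1-based, inclusive row range X[beg:end,].

    Hypothetical helper mirroring the two DML snippets in the comment;
    assumes batch_size <= num_images.
    """
    end = beg + batch_size - 1
    if end > num_images:
        if fixed_size:
            # Second variant: wrap around to the first rows so the batch
            # keeps its full size (the last few records are skipped).
            beg = 1
            end = beg + batch_size - 1
        else:
            # First variant: clamp to the data, which shrinks the last batch.
            end = num_images
    n = end - beg + 1
    return beg, end, n


def batch_sizes(batch_size, num_images, fixed_size):
    """Collect n for every batch start position in one pass over the data."""
    return [
        batch_bounds(beg, batch_size, num_images, fixed_size)[2]
        for beg in range(1, num_images + 1, batch_size)
    ]
```

With a batch size of 64 over 200 rows, `batch_sizes(64, 200, False)` ends in a short batch of 8 rows, whereas `batch_sizes(64, 200, True)` yields 64 for every batch, so the value passed as `n` in `input_shape` is the same constant on every iteration.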
[jira] [Commented] (SYSTEMML-762) Fix the bug that causes local MR-Jobs when running in non-singlenode mode on MNIST data for Lenet script
[ https://issues.apache.org/jira/browse/SYSTEMML-762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15333009#comment-15333009 ]

Niketan Pansare commented on SYSTEMML-762:
------------------------------------------

Fixed by the commit:
https://github.com/apache/incubator-systemml/commit/55c8ee7d6e3c1fcdf5c2583eee3f0a287d4baac9