[ 
https://issues.apache.org/jira/browse/SYSTEMML-762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15345582#comment-15345582
 ] 

Niketan Pansare commented on SYSTEMML-762:
------------------------------------------

[~mwdus...@us.ibm.com] Local MR jobs were created for the matrix multiplication 
(mapmm) because n could not be computed until nrow was executed. (Note: conv2d = 
reshape_col(filter %*% im2col(image)).)

{code:none}
end = beg + BATCH_SIZE - 1
if(end > num_images) end = num_images
#pulling out the batch
Xb = X[beg:end,]
n = nrow(Xb)
H1_activations = conv2d(Xb, conv_layer1_wts, padding=[2,2], stride=[1,1],
    input_shape=[n,1,img_height,img_width],
    filter_shape=[n1,1,kernel_height,kernel_width])
{code}

If you change the code to the following, you will no longer see local MR jobs:
{code:none}
end = beg + BATCH_SIZE - 1
if(end > num_images) {
    beg = 1
    end = beg + BATCH_SIZE - 1
    #end = num_images
}

#pulling out the batch
Xb = X[beg:end,] # Note: you will be missing the last few records
n = BATCH_SIZE
{code}
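
To make the trade-off between the two snippets concrete, here is a minimal sketch of the index arithmetic only, written in Python rather than DML so it can be run standalone. The function names and the sizes in the example are hypothetical, and it assumes BATCH_SIZE <= num_images with 1-based, inclusive row ranges as in DML.

```python
def clamped_batch(beg, batch_size, num_images):
    """First variant: clamp `end`, so the last batch may be short."""
    end = beg + batch_size - 1
    if end > num_images:
        end = num_images
    # The row count depends on runtime values of beg and end.
    return beg, end, end - beg + 1


def wrapped_batch(beg, batch_size, num_images):
    """Second variant: restart at row 1, so every batch has batch_size rows."""
    end = beg + batch_size - 1
    if end > num_images:
        beg = 1
        end = beg + batch_size - 1
    # The row count is the constant batch_size, whatever beg was.
    return beg, end, batch_size


# Hypothetical sizes: 10 images, batches of 4, last batch starting at row 9.
print(clamped_batch(9, 4, 10))  # (9, 10, 2): short batch, size known only at runtime
print(wrapped_batch(9, 4, 10))  # (1, 4, 4): full batch, but rows 9 and 10 are skipped
```

In the second variant the batch size no longer depends on beg or end, which is what allows n to be set to BATCH_SIZE directly. The cost is that the trailing rows never form a batch.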

I have added debug output to help identify situations like this in the 
following commit: 
https://github.com/apache/incubator-systemml/commit/873229f30527c8bfe6dc9399f53fd9f6dbb5b10e

Since we recompile at the loop level, I am not sure we can fix the former case.
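
As a rough illustration of why an unknown n matters for the matrix multiplication, the sketch below computes the operand shapes of filter %*% im2col(image) from the conv2d arguments. It is written in Python, the function name is hypothetical, and the layout (one im2col matrix for the whole batch, with n * out_h * out_w columns) is an assumption made for illustration, not a description of SystemML's actual implementation. Passing n=None models a batch size that is unknown at compile time.

```python
def conv2d_matmul_shapes(n, channels, height, width,
                         num_filters, kh, kw, pad, stride):
    """Return (filter_shape, im2col_shape) for filter %*% im2col(image).

    n=None stands for a batch size that is not known yet.
    """
    # Standard convolution output size per spatial dimension.
    out_h = (height + 2 * pad[0] - kh) // stride[0] + 1
    out_w = (width + 2 * pad[1] - kw) // stride[1] + 1

    # Each filter is flattened into one row.
    filter_shape = (num_filters, channels * kh * kw)

    # Assumed layout: one column per output position per image.
    cols = None if n is None else n * out_h * out_w
    im2col_shape = (channels * kh * kw, cols)
    return filter_shape, im2col_shape


# Hypothetical LeNet-like values: 28x28 single-channel images, 32 filters of 5x5.
known = conv2d_matmul_shapes(64, 1, 28, 28, 32, 5, 5, (2, 2), (1, 1))
unknown = conv2d_matmul_shapes(None, 1, 28, 28, 32, 5, 5, (2, 2), (1, 1))
print(known)    # ((32, 25), (25, 50176))
print(unknown)  # ((32, 25), (25, None))
```

With a literal batch size every dimension is available up front. With n unknown, one dimension of the right-hand operand stays undetermined until nrow has been evaluated.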

> Fix the bug that causes local MR-Jobs when running in non-singlenode mode on 
> MNIST data for Lenet script
> --------------------------------------------------------------------------------------------------------
>
>                 Key: SYSTEMML-762
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-762
>             Project: SystemML
>          Issue Type: Bug
>            Reporter: Niketan Pansare
>            Assignee: Niketan Pansare
>         Attachments: log.txt, log2.txt, log3.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
