[jira] [Commented] (SYSTEMML-678) MLContext parallelization

Matthias Boehm (JIRA) Tue, 10 May 2016 19:58:29 -0700

    [ https://issues.apache.org/jira/browse/SYSTEMML-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15279418#comment-15279418 ]


Matthias Boehm commented on SYSTEMML-678:
-----------------------------------------

Thanks for the question, [~johannes.tud]. 

In general, SystemML provides, for every operation that involves matrices 
(with very few exceptions), both single-node in-memory (CP) operators and 
data-parallel distributed operators (Spark/MR). If an operation (with pinned 
inputs/outputs) fits into the driver memory budget (70% of the driver heap 
size), we execute it in single-node CP (multi-threaded or single-threaded, 
depending on the operation); otherwise we compile distributed operations, 
chosen according to data and cluster characteristics. For Spark, operator 
selection is slightly different, as we also transitively pull certain 
operations into distributed pipelines if their inputs are already distributed. 
Task-parallel computation (with parfor assertion) complements these 
data-parallel operations, and the two can be arbitrarily combined (e.g., 
multi-threaded single-node execution, concurrent data-parallel jobs, 
distributed task-parallel computation). However, except for some very specific 
loop vectorization rewrites, we do not yet automatically identify subprograms 
other than parfor to execute in a task-parallel manner. Extended automatic 
vectorization is certainly an interesting direction, and we welcome any 
contributions here. 

Now back to the actual script at hand. Even with parfor, SystemML is currently 
not able to run this loop in a task-parallel manner because there are 
loop-carried dependencies over 'sum': every iteration reads and updates the 
same variable. Specifying the parfor parameter 'check=0' disables the 
dependency analysis, so the loop runs, but it would produce undefined results. 
There are often ways to express the computation slightly differently to work 
around current shortcomings of the compiler. Feel free to post the problem on 
our dev list: d...@systemml.incubator.apache.org. 
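
As one illustration of such a rewrite (a minimal sketch, not tested against 
SystemML 0.10): instead of accumulating into the scalar 'sum', each iteration 
can write its squared error into its own cell of a result vector, and the total 
is computed once after the loop. The names sqErr and err are hypothetical, and 
the one-line prediction is only a placeholder for the window construction and 
solve() call of the original loop body. The 'sum >= 0' loop condition is 
dropped on the assumption that it was never meant to terminate the loop early, 
since a sum of squares does not become negative.

```dml
X = read($Xin)
n = nrow(X)

# hypothetical result vector: one cell per iteration instead of a shared scalar
sqErr = matrix(0, rows=n, cols=1)

parfor (j in 1:n) {
  # placeholder for the per-row prediction computed in the original script
  predictscalar = as.scalar(X[j,1])
  targetscalar = as.scalar(X[j,ncol(X)])
  err = targetscalar - predictscalar
  # iteration j writes only to row j, so no iteration reads another's result
  sqErr[j,1] = err * err
}

# aggregate once, after the loop
print(sum(sqErr) / n)
```

Written this way, the only variable shared across iterations is indexed by the 
loop variable, so the default dependency analysis should be able to accept the 
loop without 'check=0'.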

> MLContext parallelization
> -------------------------
>
>                 Key: SYSTEMML-678
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-678
>             Project: SystemML
>          Issue Type: Question
>          Components: Algorithms, Parser, Runtime
>    Affects Versions: SystemML 0.10
>            Reporter: Johannes Wilke
>
> I am trying to execute a script in the MLContext. It executes, but it does 
> not run in parallel. For smaller scripts it works fine, but this script does 
> not, and it is not clear why. I think it is because of the 4 loop levels, but 
> I am not sure. 
> Is there documentation on what is parallelizable and what is not?
> If I change the main while loop, which I want to parallelize, to a parfor 
> loop, it works.
> Here is the script:
> X = read($Xin)
> P = read($Pin)
> #errorMatrix = matrix(0.0,rows=1,cols=1)
> j = 1
> sum = 0
> while (j <=nrow(X) & sum >= 0){ # this should be parallelized 
> #parfor(j in 1: nrow(X),check=0){
>       first = TRUE
>       windows = matrix(0,rows=1,cols=1)
>       offsetPreWindowDefinitions = 0
>       sumWindowLength = 0
>       mastercount = 0
>       totalwindowLength = 0
>       s = 0
>       for(i in 1: nrow(P)){
>               if((as.scalar(P[i,1])*as.scalar(P[i,2]))>totalwindowLength){
>                       totalwindowLength = (as.scalar(P[i,1])*as.scalar(P[i,2]))
>               }
>               s = s+1
>       }
>       lastWindow = matrix(0,rows=sum(P[,1]),cols=1)
>       
>       for(i in 1:nrow(P)){# for every Window-Definition
>               
>               for(k in 1: as.integer(as.scalar(P[i,1]))){# for every pnum
>                       column = matrix(0,rows=as.integer(as.scalar(P[1,4])),cols=1)
>                       for(l in 1: nrow(column)+1){
>                               offsetPreWindowDefinitions = totalwindowLength - (as.scalar(P[i,1])*as.scalar(P[i,2]))
>                               tsindex = ((k-1) * as.scalar(P[i,2])) + l-1 + offsetPreWindowDefinitions
>                               if(l==nrow(column)+1){
>                                       lastWindow[sumWindowLength+k,1] = X[j,tsindex+1]
>                               } else {
>                                       
>                                       column[l,1] = X[j,tsindex+1]
>                               }
>                               mastercount = mastercount +1
>                               #print(mastercount)
>                       }
>                       if(first){
>                               first = FALSE;
>                               windows = column
>                       } else {
>                               windows = cbind(windows,column)
>                       }
>               }
>               
>               sumWindowLength = sumWindowLength + as.scalar(P[i,1])
>       }
>       
>       
>       result = matrix(14.3,rows=as.integer(as.scalar(P[1,4])),cols=1)
>       for(i in totalwindowLength:as.integer(as.scalar(P[1,4]))+totalwindowLength-1){
>               result[i-totalwindowLength+1,1] = X[j,i+1]      
>               s = s+1
>       }
>       params = solve(windows,result)
>       print(j)
>       predict = matrix(0,rows=1, cols=1)
>       for(i in 1:nrow(lastWindow)){
>               predict[1,1] = predict[1,1] + (params[i,1] * lastWindow[i,1])
>               s = s+1
>       }
>       
>       predictscalar = as.scalar(predict[1,1])
>       targetscalar = as.scalar(X[j,ncol(X)])
>       sum = sum + ((targetscalar - predictscalar) * (targetscalar - predictscalar))
>       
>       
>       
>       j = j+1
>       #write(lastWindow, "/media/johannes/Data/Seafile/UNI/Beleg/sysml_output/lWOut.csv", format="csv", header=TRUE, sep=",", sparse=TRUE);
>       #write(windows, "/media/johannes/Data/Seafile/UNI/Beleg/sysml_output/windowsOut.csv", format="csv", header=TRUE, sep=",", sparse=TRUE);
>       #write(result, "/media/johannes/Data/Seafile/UNI/Beleg/sysml_output/resultOut.csv", format="csv", header=TRUE, sep=",", sparse=TRUE);
> }
> print(sum/nrow(X))
> I hope that you can help me!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
