[
https://issues.apache.org/jira/browse/SYSTEMML-843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405096#comment-15405096
]
Imran Younus commented on SYSTEMML-843:
---------------------------------------
[~nakul02] and I investigated the tSNE algorithm further today. After applying the
changes suggested by [~mboehm7], there was a huge improvement in the function
{{x2p}}. We then looked at the performance of {{tsne}}, which implements the
gradient descent. It takes more than twice as long as the Python
implementation (820 sec vs 390 sec). I've updated the tSNE.dml script on GitHub
(https://github.com/apache/incubator-systemml/pull/200). Here are the stats:
{code}
16/08/02 17:17:16 INFO api.DMLScript: SystemML Statistics:
Total elapsed time: 823.878 sec.
Total compilation time: 0.458 sec.
Total execution time: 823.420 sec.
Number of compiled MR Jobs: 1.
Number of executed MR Jobs: 0.
Cache hits (Mem, WB, FS, HDFS): 1299576/0/0/1.
Cache writes (WB, FS, HDFS): 379333/0/2.
Cache times (ACQr/m, RLS, EXP): 0.400/0.079/0.996/0.036 sec.
HOP DAGs recompiled (PRED, SB): 0/2001.
HOP DAGs recompile time: 1.035 sec.
Functions recompiled: 2.
Functions recompile time: 0.035 sec.
Total JIT compile time: 19.167 sec.
Total JVM GC count: 1951.
Total JVM GC time: 13.888 sec.
Heavy hitter instructions (name, time, count):
-- 1) tsne 823.061 sec 1
-- 2) * 244.509 sec 120782
-- 3) / 173.698 sec 173172
-- 4) + 168.255 sec 164652
-- 5) - 91.713 sec 118780
-- 6) tak+* 53.855 sec 58390
-- 7) tsmm 48.542 sec 2001
-- 8) uark+ 27.817 sec 2000
-- 9) x2p 10.541 sec 1
-- 10) ba+* 6.135 sec 2000
16/08/02 17:17:16 INFO api.DMLScript: END DML run 08/02/2016 17:17:16
{code}
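To put the heavy-hitter list in perspective, the timings can simply be added up. The short Python sketch below uses only numbers copied from the stats above (the variable names are purely illustrative). It suggests that the four element-wise operators {{*}}, {{/}}, {{+}} and {{-}} together account for roughly 82% of the execution time, so they are probably the place to look next.

```python
# Timings (sec) and call counts copied from the heavy-hitter list above.
total_exec = 823.420
heavy_hitters = {
    "*": (244.509, 120782),
    "/": (173.698, 173172),
    "+": (168.255, 164652),
    "-": (91.713, 118780),
}

# Combined time spent in the four element-wise operators.
elementwise_time = sum(t for t, _ in heavy_hitters.values())
share = elementwise_time / total_exec

# Average cost of a single call, in milliseconds.
ms_per_call = {op: 1000.0 * t / n for op, (t, n) in heavy_hitters.items()}

print(f"element-wise ops: {elementwise_time:.3f} sec ({share:.1%} of execution time)")
for op, ms in ms_per_call.items():
    print(f"  {op}: {ms:.2f} ms per call")
```

Each call costs only a millisecond or two on average, so the total appears to come from the sheer number of calls (over 100,000 per operator) rather than from any single slow instruction.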
Here is the {{tsne}} function:
{code}
tsne = function(matrix[double] X, int reduced_dims, int initial_dims, int perplexity)
  return(matrix[double] Y, matrix[double] C) {
  d = reduced_dims
  n = nrow(X)
  max_iter = 2000
  eta = 500
  P = x2p(X, 1.0e-5, 20.0)
  P = P*4
  Y = rand(rows=n, cols=d, pdf="normal")
  C = matrix(0, rows=max_iter, cols=1)
  ZERODIAG = (diag(matrix(-1, rows=n, cols=1)) + 1)
  for (itr in 1:max_iter) {
    D = distance_matrix(Y)
    Z = 1/(D + 1)
    Z = Z * ZERODIAG
    Q = Z/sum(Z)
    W = (P - Q)*Z
    sumW = rowSums(W)
    grad_C = Y * sumW - W %*% Y
    Y = Y - eta*grad_C
    Y = Y - colMeans(Y)
    if (itr%%50 == 0) {
      #C[itr,] = sum(P * log(pmax(P, 1e-12) / pmax(Q, 1e-12)))
      #print(as.scalar(C[itr,1]))
      print(itr)
    }
    if (itr == 100) {
      P = P/4
    }
  }
}
{code}
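For readers less familiar with DML, here is a dependency-free Python sketch of what one pass through the loop body computes. It is not the Python implementation whose timing is quoted above. {{distance_matrix}} is not shown in this comment, so the sketch assumes it returns squared Euclidean distances between the rows of {{Y}}. The function name {{tsne_step}} and the tiny inputs are made up for illustration.

```python
def distance_matrix(Y):
    """Squared Euclidean distances between rows of Y (assumed definition)."""
    return [[sum((a - b) ** 2 for a, b in zip(yi, yj)) for yj in Y] for yi in Y]


def tsne_step(Y, P, eta):
    """One gradient-descent update, mirroring the body of the DML loop."""
    n, d = len(Y), len(Y[0])
    D = distance_matrix(Y)
    # Z = 1/(D + 1), with the diagonal zeroed (the role of ZERODIAG).
    Z = [[0.0 if i == j else 1.0 / (D[i][j] + 1.0) for j in range(n)]
         for i in range(n)]
    z_sum = sum(map(sum, Z))
    # W = (P - Q) * Z element-wise, where Q = Z / sum(Z).
    W = [[(P[i][j] - Z[i][j] / z_sum) * Z[i][j] for j in range(n)]
         for i in range(n)]
    sum_w = [sum(row) for row in W]
    # grad_C = Y * sumW - W %*% Y
    grad = [[Y[i][k] * sum_w[i] - sum(W[i][j] * Y[j][k] for j in range(n))
             for k in range(d)] for i in range(n)]
    # Y = Y - eta * grad_C
    Y_new = [[Y[i][k] - eta * grad[i][k] for k in range(d)] for i in range(n)]
    # Y = Y - colMeans(Y): re-centre the embedding.
    means = [sum(row[k] for row in Y_new) / n for k in range(d)]
    return [[row[k] - means[k] for k in range(d)] for row in Y_new]


# Made-up inputs: three 2-D points and a symmetric affinity matrix summing to 1.
Y0 = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
P0 = [[0.0, 0.2, 0.1], [0.2, 0.0, 0.2], [0.1, 0.2, 0.0]]
Y1 = tsne_step(Y0, P0, eta=0.5)
print(Y1)
```

Every line of the loop body is a full n x n element-wise operation or a reduction over such a matrix, which fits with {{*}}, {{/}}, {{+}} and {{-}} leading the heavy-hitter list.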
> leftIndex and cache release extremely slow
> ------------------------------------------
>
> Key: SYSTEMML-843
> URL: https://issues.apache.org/jira/browse/SYSTEMML-843
> Project: SystemML
> Issue Type: Bug
> Reporter: Imran Younus
> Attachments: tSNT.tar.gz
>
>
> I'm running the tSNE script in standalone mode with a subset of MNIST data
> (2500 points). I ran this with and without `-exec singlenode`. Here are the
> stats:
> (BTW, the same function implemented in python takes less than 10 sec!)
> -> with singlenode flag
> {code}
> ./bin/systemml scripts/staging/tSNE.dml -stats -nvargs
> X=/home/iyounus/workspace/tsne_python/mnist2500_X.txt Y=Y_out.txt C=C_out.txt
> 16/08/01 16:46:54 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time: 109.667 sec.
> Total compilation time: 0.407 sec.
> Total execution time: 109.260 sec.
> Number of compiled MR Jobs: 0.
> Number of executed MR Jobs: 0.
> Cache hits (Mem, WB, FS, HDFS): 223692/0/0/1.
> Cache writes (WB, FS, HDFS): 80351/0/2.
> Cache times (ACQr/m, RLS, EXP): 0.289/0.015/85.192/0.043 sec.
> HOP DAGs recompiled (PRED, SB): 0/0.
> HOP DAGs recompile time: 0.007 sec.
> Functions recompiled: 1.
> Functions recompile time: 0.039 sec.
> Total JIT compile time: 4.924 sec.
> Total JVM GC count: 312.
> Total JVM GC time: 1.12 sec.
> Heavy hitter instructions (name, time, count):
> -- 1) tsne 109.202 sec 1
> -- 2) x2p 109.189 sec 1
> -- 3) leftIndex 106.728 sec 32136
> -- 4) tsmm 0.564 sec 1
> -- 5) exp 0.376 sec 8034
> -- 6) rangeReIndex 0.201 sec 40170
> -- 7) / 0.183 sec 24103
> -- 8) * 0.161 sec 16069
> -- 9) + 0.144 sec 22840
> -- 10) uak+ 0.106 sec 8036
> 16/08/01 16:46:54 INFO api.DMLScript: END DML run 08/01/2016 16:46:54
> {code}
> -> without singlenode flag
> {code}
> ./bin/systemml scripts/staging/tSNE.dml -stats -nvargs
> X=/home/iyounus/workspace/tsne_python/mnist2500_X.txt Y=Y_out.txt C=C_out.txt
> 16/08/01 16:52:59 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time: 127.290 sec.
> Total compilation time: 0.396 sec.
> Total execution time: 126.894 sec.
> Number of compiled MR Jobs: 1.
> Number of executed MR Jobs: 0.
> Cache hits (Mem, WB, FS, HDFS): 223693/0/0/1.
> Cache writes (WB, FS, HDFS): 80352/0/2.
> Cache times (ACQr/m, RLS, EXP): 0.421/0.016/100.974/0.041 sec.
> HOP DAGs recompiled (PRED, SB): 0/0.
> HOP DAGs recompile time: 0.009 sec.
> Functions recompiled: 1.
> Functions recompile time: 0.038 sec.
> Total JIT compile time: 4.835 sec.
> Total JVM GC count: 312.
> Total JVM GC time: 1.226 sec.
> Heavy hitter instructions (name, time, count):
> -- 1) tsne 126.426 sec 1
> -- 2) x2p 126.412 sec 1
> -- 3) leftIndex 123.982 sec 32136
> -- 4) exp 0.427 sec 8034
> -- 5) MR-Job_CSV_REBLOCK 0.412 sec 1
> -- 6) tsmm 0.308 sec 1
> -- 7) rangeReIndex 0.242 sec 40170
> -- 8) / 0.208 sec 24103
> -- 9) + 0.172 sec 22840
> -- 10) * 0.151 sec 16069
> 16/08/01 16:52:59 INFO api.DMLScript: END DML run 08/01/2016 16:52:59
> {code}