Imran Younus created SYSTEMML-1467:
--------------------------------------

             Summary: optimizer changes the output of table function
                 Key: SYSTEMML-1467
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1467
             Project: SystemML
          Issue Type: Bug
          Components: Runtime
            Reporter: Imran Younus

The following is a small code snippet that is part of an r4ml DML script.

{code}
reorder_matrix = function(
  matrix[double] X, # the input matrix, we have to account for extra beta0 for the intercept
  matrix[double] B, # beta
  matrix[double] S  # Selected
) return (matrix[double] Y) {
  S = t(S);
  whatever = ncol(X)
  num_empty_B = ncol(X) - nrow(B);
  if (num_empty_B < 0) {
    stop("Error: unable to re-order the matrix. Reason: B more than matrix X");
  }
  P = table(seq(1, nrow(S)), S, ncol(X), ncol(X));
  #P = table(seq(1, nrow(S)), S, whatever, whatever);
  Y = t(P) %*% B;
}

X = rand(rows=5000, cols=11, min=0, max=10, pdf="uniform")
Beta = matrix("-0.048 -1.918 0.000 0.024 0.000 0.005 -0.001 1.842 0.216 0.483 0.361", rows=11, cols=1)
Selected = matrix("11.000 8.000 2.000 4.000 1.000 6.000 10.000 7.000 9.000", rows=1, cols=9)
B = reorder_matrix(X, Beta, Selected)
print(toString(B))
{code}

This works without any problem. Matrix {{P}} is {{11x11}}, with the last two rows filled with zeros. BTW, I'm running it in standalone mode.

Now, if I replace the definition of {{P}} with the commented-out variant:

{code}
P = table(seq(1, nrow(S)), S, whatever, whatever);
{code}

then nothing should change, because {{whatever = ncol(X)}}. But this breaks, because the shape of {{P}} now changes: the last two rows are dropped because they are all zeros.

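To make the shape difference concrete, here is a minimal Python sketch of a contingency table with and without explicit output dimensions. It is a toy model, not SystemML code: the helper name `table_model` is hypothetical, and how it treats out-of-range entries is an assumption made only for this illustration.

```python
def table_model(rows, cols, out_rows=None, out_cols=None):
    """Toy contingency table over 1-based index vectors.

    If out_rows/out_cols are omitted, the shape is inferred from the
    largest index seen, so trailing all-zero rows never appear.
    """
    n_rows = out_rows if out_rows is not None else max(rows)
    n_cols = out_cols if out_cols is not None else max(cols)
    result = [[0] * n_cols for _ in range(n_rows)]
    for i, j in zip(rows, cols):
        # Assumption of this sketch: entries outside the requested
        # bounds are skipped rather than raising an error.
        if i <= n_rows and j <= n_cols:
            result[i - 1][j - 1] += 1
    return result


# Same values as the Selected vector in the report.
selected = [11, 8, 2, 4, 1, 6, 10, 7, 9]
seq = list(range(1, len(selected) + 1))

padded = table_model(seq, selected, 11, 11)  # explicit 11x11 output
inferred = table_model(seq, selected)        # shape inferred from the data

print(len(padded), len(padded[0]))      # 11 11
print(len(inferred), len(inferred[0]))  # 9 11
```

The two results agree on the first nine rows. The explicit version only appends two all-zero rows, which is exactly what `t(P) %*% B` needs so that the dimensions line up with the 11x1 `B`.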
Here is the explain output for this case:

{code}
17/04/05 17:38:40 INFO api.DMLScript: EXPLAIN (RUNTIME):
# Memory Budget local/remote = 5496MB/140MB/140MB
# Degree of Parallelism (vcores) local/remote = 8/1/1
PROGRAM ( size CP/MR = 0/0 )
--MAIN PROGRAM
----GENERIC (lines 20-24) [recompile=false]
------CP createvar _mVar0 scratch_space//_p26900_127.0.1.1//_t0/temp0 true MATRIX binaryblock 11 1 1000 1000 -1 copy
------CP sinit 11 1 1000 1000 -0.048 -1.918 0.000 0.024 0.000 0.005 -0.001 1.842 0.216 0.483 0.361 _mVar0.MATRIX.DOUBLE
------CP createvar _mVar1 scratch_space//_p26900_127.0.1.1//_t0/temp1 true MATRIX binaryblock 9 1 1000 1000 -1 copy
------CP sinit 9 1 1000 1000 11.000 8.000 2.000 4.000 1.000 6.000 10.000 7.000 9.000 _mVar1.MATRIX.DOUBLE
------CP createvar _mVar2 scratch_space//_p26900_127.0.1.1//_t0/temp2 true MATRIX binarycell 1 11 -1 -1 -1 copy
------CP r' _mVar0.MATRIX.DOUBLE _mVar2.MATRIX.DOUBLE 8
------CP rmvar _mVar0
------CP createvar _mVar3 scratch_space//_p26900_127.0.1.1//_t0/temp3 true MATRIX binarycell 9 11 -1 -1 -1 copy
------CP rexpand cast=true max=11 ignore=false dir=cols target=_mVar1 _mVar3.MATRIX.DOUBLE
------CP rmvar _mVar1
------CP createvar _mVar4 scratch_space//_p26900_127.0.1.1//_t0/temp4 true MATRIX binarycell 1 11 -1 -1 -1 copy
------CP ba+* _mVar2.MATRIX.DOUBLE _mVar3.MATRIX.DOUBLE _mVar4.MATRIX.DOUBLE 8
------CP rmvar _mVar2
------CP rmvar _mVar3
------CP createvar _mVar5 scratch_space//_p26900_127.0.1.1//_t0/temp5 true MATRIX binarycell 11 1 -1 -1 -1 copy
------CP r' _mVar4.MATRIX.DOUBLE _mVar5.MATRIX.DOUBLE 8
------CP rmvar _mVar4
------CP toString target=_mVar5 _Var6.SCALAR.STRING
------CP rmvar _mVar5
------CP print _Var6.SCALAR.STRING.false _Var7.SCALAR.STRING
------CP rmvar _Var6
------CP rmvar _Var7
{code}

Now, let's revert the definition of {{P}} to

{code}
P = table(seq(1, nrow(S)), S, ncol(X), ncol(X));
{code}

If I comment out the {{if}} statement, then it breaks again for the same reason.

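The dimensions declared in the explain output show where the mismatch comes from. The Python sketch below only does shape bookkeeping. The helper names are hypothetical, and reading the plan as computing `t(t(B) %*% P)` in place of `t(P) %*% B` is an interpretation of the instruction order, not something the log states.

```python
def transpose_shape(shape):
    """Shape of the transpose of a (rows, cols) matrix."""
    return (shape[1], shape[0])


def matmul_shape(left, right):
    """Shape of left %*% right, or ValueError if inner dims disagree."""
    if left[1] != right[0]:
        raise ValueError(
            f"inner dimensions differ: {left[0]}x{left[1]} "
            f"times {right[0]}x{right[1]}"
        )
    return (left[0], right[1])


beta = (11, 1)                   # _mVar0, declared 11 x 1
t_beta = transpose_shape(beta)   # _mVar2, declared 1 x 11

rexpand_out = (9, 11)            # _mVar3, declared 9 x 11 (rexpand max=11)
expected_p = (11, 11)            # shape of P when the script works

# With the full 11x11 P, the product and final transpose line up with
# the declared shapes of _mVar4 (1 x 11) and _mVar5 (11 x 1).
ok = matmul_shape(t_beta, expected_p)
print(ok, transpose_shape(ok))

# With the 9x11 rexpand output, the inner dimensions are 11 and 9.
try:
    matmul_shape(t_beta, rexpand_out)
    mismatch = None
except ValueError as err:
    mismatch = str(err)
print(mismatch)
```

In this reading, `rexpand` yields one row per entry of `S` (9 rows), so the requested 11 rows are lost before the matrix multiply runs.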
Of course, this is the "optimization" at work here. If I set {{optlevel}} to 0 or 1 in {{SystemML-config.xml}}, then it doesn't break.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)