Imran Younus created SYSTEMML-1467:
--------------------------------------

             Summary: optimizer changes the output of table function
                 Key: SYSTEMML-1467
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1467
             Project: SystemML
          Issue Type: Bug
          Components: Runtime
            Reporter: Imran Younus


The following is a small code snippet that is part of an R4ML DML script.

{code}
reorder_matrix = function(
  matrix[double] X, # the input matrix, we have to account for extra beta0 for the intercept
  matrix[double] B, # beta
  matrix[double] S  # Selected
) return (matrix[double] Y) {

  S = t(S);
  whatever = ncol(X)
  num_empty_B = ncol(X) - nrow(B);
  if (num_empty_B < 0) {
    stop("Error: unable to re-order the matrix. Reason: B more than matrix X");
  }

  P = table(seq(1, nrow(S)), S, ncol(X), ncol(X));
  #P = table(seq(1, nrow(S)), S, whatever, whatever);

  Y = t(P) %*% B;
}

X = rand(rows=5000, cols=11, min=0, max=10, pdf="uniform")
Beta = matrix("-0.048 -1.918 0.000 0.024 0.000 0.005 -0.001 1.842 0.216 0.483 0.361", rows=11, cols=1)
Selected = matrix("11.000 8.000 2.000 4.000 1.000 6.000 10.000 7.000 9.000", rows=1, cols=9)
B = reorder_matrix(X, Beta, Selected)
print(toString(B))
{code}

This works without any problem. Matrix {{P}} is {{11x11}}, with the last two 
rows filled with zeros. BTW, I'm running it in standalone mode.
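
For reference, the result expected here can be reproduced outside SystemML. The following is a plain-Python sketch, not SystemML code: the helpers {{table}}, {{transpose}} and {{matmul}} are hypothetical stand-ins, and the sketch assumes that {{table}} with explicit output dimensions uses 1-based indices and pads the result with zeros up to the requested size.

```python
def table(rows, cols, n_rows, n_cols):
    """Contingency table with explicit output dimensions (1-based indices)."""
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r, c in zip(rows, cols):
        out[r - 1][c - 1] += 1.0
    return out


def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product; raises if the inner dimensions differ."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions differ")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Values copied from the script in this report.
beta = [-0.048, -1.918, 0.000, 0.024, 0.000, 0.005,
        -0.001, 1.842, 0.216, 0.483, 0.361]
selected = [11, 8, 2, 4, 1, 6, 10, 7, 9]
n = 11  # ncol(X)

# P = table(seq(1, nrow(S)), S, ncol(X), ncol(X))
P = table(range(1, len(selected) + 1), selected, n, n)
B = [[b] for b in beta]          # 11x1 column vector
Y = matmul(transpose(P), B)      # Y = t(P) %*% B

print(len(P), len(P[0]))         # dimensions of P
print([row[0] for row in Y])     # re-ordered coefficients
```

Under these assumptions {{P}} keeps all 11 rows, the last two being all zeros, and {{Y}} stays {{11x1}}.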

Now, if I change line 14 to this:
{code}
P = table(seq(1, nrow(S)), S, whatever, whatever);
{code}

then nothing should change, because {{whatever = ncol(X)}}. But this breaks, 
because the shape of {{P}} now changes: the last two rows are removed because 
they are all zeros.
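
To illustrate why the changed shape matters, here is a plain-Python sketch of the dimension check for {{Y = t(P) %*% B}}. It is an assumption on my part that the failure is a dimension mismatch in this multiplication, and the helper names {{matmul_shape}} and {{transposed}} are hypothetical.

```python
def matmul_shape(a_shape, b_shape):
    """Return the shape of a %*% b, or raise if the inner dimensions differ."""
    if a_shape[1] != b_shape[0]:
        raise ValueError(f"inner dimensions differ: {a_shape} x {b_shape}")
    return (a_shape[0], b_shape[1])


def transposed(shape):
    """Shape of the transpose of a matrix with the given shape."""
    return (shape[1], shape[0])


b_shape = (11, 1)       # B (Beta) is 11x1
expected_p = (11, 11)   # P with the requested output dimensions
truncated_p = (9, 11)   # P after the two all-zero rows are dropped

# With the full 11x11 P the product is well defined.
print(matmul_shape(transposed(expected_p), b_shape))

# With the truncated P, t(P) is 11x9 and cannot be multiplied by an 11x1 B.
try:
    matmul_shape(transposed(truncated_p), b_shape)
except ValueError as err:
    print("fails:", err)
```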

Here is the explain output for this case:
{code}
17/04/05 17:38:40 INFO api.DMLScript: EXPLAIN (RUNTIME):
# Memory Budget local/remote = 5496MB/140MB/140MB
# Degree of Parallelism (vcores) local/remote = 8/1/1
PROGRAM ( size CP/MR = 0/0 )
--MAIN PROGRAM
----GENERIC (lines 20-24) [recompile=false]
------CP createvar _mVar0 scratch_space//_p26900_127.0.1.1//_t0/temp0 true MATRIX binaryblock 11 1 1000 1000 -1 copy
------CP sinit 11 1 1000 1000 -0.048 -1.918 0.000 0.024 0.000 0.005 -0.001 1.842 0.216 0.483 0.361 _mVar0.MATRIX.DOUBLE
------CP createvar _mVar1 scratch_space//_p26900_127.0.1.1//_t0/temp1 true MATRIX binaryblock 9 1 1000 1000 -1 copy
------CP sinit 9 1 1000 1000 11.000 8.000 2.000 4.000 1.000 6.000 10.000 7.000 9.000 _mVar1.MATRIX.DOUBLE
------CP createvar _mVar2 scratch_space//_p26900_127.0.1.1//_t0/temp2 true MATRIX binarycell 1 11 -1 -1 -1 copy
------CP r' _mVar0.MATRIX.DOUBLE _mVar2.MATRIX.DOUBLE 8
------CP rmvar _mVar0
------CP createvar _mVar3 scratch_space//_p26900_127.0.1.1//_t0/temp3 true MATRIX binarycell 9 11 -1 -1 -1 copy
------CP rexpand cast=true max=11 ignore=false dir=cols target=_mVar1 _mVar3.MATRIX.DOUBLE
------CP rmvar _mVar1
------CP createvar _mVar4 scratch_space//_p26900_127.0.1.1//_t0/temp4 true MATRIX binarycell 1 11 -1 -1 -1 copy
------CP ba+* _mVar2.MATRIX.DOUBLE _mVar3.MATRIX.DOUBLE _mVar4.MATRIX.DOUBLE 8
------CP rmvar _mVar2
------CP rmvar _mVar3
------CP createvar _mVar5 scratch_space//_p26900_127.0.1.1//_t0/temp5 true MATRIX binarycell 11 1 -1 -1 -1 copy
------CP r' _mVar4.MATRIX.DOUBLE _mVar5.MATRIX.DOUBLE 8
------CP rmvar _mVar4
------CP toString target=_mVar5 _Var6.SCALAR.STRING
------CP rmvar _mVar5
------CP print _Var6.SCALAR.STRING.false _Var7.SCALAR.STRING
------CP rmvar _Var6
------CP rmvar _Var7
{code}

Now, let's revert the definition of {{P}} to
{code}
P = table(seq(1, nrow(S)), S, ncol(X), ncol(X));
{code}

If I comment out the {{if}} statement, then it breaks again for the same 
reason.

Of course, this is the "optimization" at work here. If I set the {{optlevel}} 
to 0 or 1 in {{SystemML-config.xml}}, then it doesn't break.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
