Dear all

I am quite new to Julia, and I am experiencing very odd failures of 
associative matrix multiplication results depending on the permutation of 
the inv(), multiplication signs and whether or not an intermediate 
multiplication result is defined as a separate variable or not. 

I'm going to provide a replication with a small toy example later, but here 
is an outline of the actual regression results where this first came up 
(happy to provide data for the underlying matrices if needed)

x_dum : 104x34 Array{Real, 2}
delta0: Vector, float64, 104

julia> Px_dum = inv(x_dum'*x_dum)*x_dum'
julia> theta10 = Px_dum*delta0   
(Vector, Real, 34)
julia> theta11 = inv(x_dum'*x_dum)*(x_dum'*delta0) 
(Vector, Real, 34)

julia>theta10[1:8]
8-element Array{Real,1}:
  0.863612
  0.315274
 -0.18194 
 -0.99634 
 -3.16722 
 -2.40826 
 -1.78346 
 -4.2834  

julia>theta11[1:8]
8-element Array{Real,1}:
  9.0     
  0.315274
 -0.18194 
 -0.99634 
 -3.25    
 -7.6875  
 -5.9375  
 -7.4375  

I'm genuinely at my wits' end from trying to determine whether this is 
arising from me overlooking some basic (if counter-intuitive) Julia syntax 
rules or this is a genuine quite profound bug.
All this has been observed while running version 0.4.5 on Windows 64bit, 
and is reproducible in REPL and scripts pre-written/run through julia IDE 
in Atom. 

Now, for a toy example to demonstrate when I encounter this result more 
generally. 

julia> x=[1 2; 3 4]
julia> y=[3 4]'

Let's try a basic regression model in different formulations:

julia>  inv(x'x)x'y ==inv(x'x)*(x'y)
true
julia>  inv(x'x)*x'y ==inv(x'x)*(x'y)
true
julia>  inv(x'x)*x'y ==inv(x'x)*(x'*y)
true
julia>  inv(x'x)*x'*y ==inv(x'x)*(x'*y)
false
??? Why would this be happening?  (the same occurs with inv(x'*x) in the 
first term)

For some reason,  inv(x'x)x'y != inv(x'x)*x'*y and inv(x'x)*(x'*y) != 
inv(x'x)*x'*y  This is pretty insidious because it's quite common to use 
parentheses to keep track of variables and/or * to separate them out. 

I've tried to trace in what circumstances the issue appears:

1. Is it the inv() call that is messing with things? No. 
julia> z=inv(x'x)
julia>  z*x'*y == z*x'y
false

(This does not occur if z is some other matrix)

2. Can the issue be avoided through using \ rather than inv? No. 
julia> z = x'x
julia> inv(z)*x'*y == inv(z)*(x'*y)
false
julia> z\x'*y == z\(x'*y)
false

3. Am I messing up because I combine transposition with * (x'* sequence)? 
No.

julia> Px = inv(x'x)x'
julia> Px*y == inv(x'x)x'y
false  (whoa!)

BUT
julia> Px*y ==inv(x'x)x'*y== inv(x'x)*x'*y
true

4. Does it happen when y is a matrix instead of a vector? Yes. (Unless y is 
an identity matrix)
julia> y = [1 5; 8 5]
julia> inv(x'x)x'y  == inv(x'x)x'*y
false

So, the issue seems to be about the multiplication sign before y even 
though it makes no difference in other contexts. 

In small examples the difference might be just floating point noise (e.g. 
try 
inv(x'x)x'y  and inv(x'x)*x'*y  for x=[5 3; 9 10]  and y=[1000; 1000]). 
 But as shown above, for real-life cases the divergence can become very 
large. 

Would be incredibly grateful for any insight on why this might be happening 
and whether it's actually a bug that needs to be reported.  Apologies if 
this has come up before: I tried searching the message board and julia 
issues list on GitHub and didn't find anything relevant. 

All best

Zanna


Reply via email to