This might be a bit faster:
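# Writes A - B into the preallocated matrix C.
function sub!(A,B,C)
    for j=1:size(A,2)        # columns in the outer loop: Julia arrays are column-major
        for i=1:size(A,1)
            @inbounds C[i,j] = A[i,j] - B[i,j]
        end
    end
end

C = zeros(size(A))
sub!(A,B,C)

Do you have enough RAM to store these matrices, though? A single 10^5 × 10^5 Float64 matrix takes 10^10 × 8 bytes = 80 GB, so A, B and C together need about 240 GB.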
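To check the memory requirement on your own machine, here is a rough sketch. `dense_bytes` is a hypothetical helper I made up for illustration, and it assumes dense storage with no overhead; `Sys.total_memory()` is available in recent Julia versions.

```julia
# Hypothetical helper (not part of Julia): bytes needed for one dense n-by-n matrix.
dense_bytes(n, T=Float64) = n^2 * sizeof(T)

n = 10^5
per_matrix = dense_bytes(n)   # one Float64 matrix
total = 3 * per_matrix        # A, B and C all held in memory at once

println("one matrix:    ", per_matrix / 1e9, " GB")
println("A, B and C:    ", total / 1e9, " GB")
println("installed RAM: ", Sys.total_memory() / 1e9, " GB")
```

If the total is well above your installed RAM, the loop itself is not the bottleneck, and you would need to process the matrices in blocks or use a sparse or memory-mapped representation instead.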
