Using a dot-product may result in an increase in execution time because the dot product loop would access the data row-wise. Arrays are stored in column-major order so row-wise access could cause cache misses.
On Thursday, November 6, 2014 12:36:10 PM UTC-6, Sebastian Good wrote: > > Agreed! Axis-reducing dot product operators might be a reasonable addition > to the standard library, especially since BLAS provides the (presumably > highly optimized or even multi-threaded) dot product primitive, which a > library function could easily sub out to using the appropriate strides. > > *Sebastian Good* > > > On Thu, Nov 6, 2014 at 1:10 PM, Douglas Bates <[email protected] > <javascript:>> wrote: > >> On Thursday, November 6, 2014 10:48:26 AM UTC-6, Sebastian Good wrote: >>> >>> A good solution for this particular problem, though presumably uses more >>> memory than a dedicated axis-aware dot product method. >>> >> >> You are correct that the method I showed does create a matrix of the same >> size as `a` and `b` to evaluate the `.+` operation. You can avoid doing so >> but, of course, the code becomes more opaque. I think the point for me is >> that if `a` and `b` are so large that the allocation and freeing of the >> memory becomes problematic, I can write the space conserving version in >> Julia and get performance comparable to compiled code. Lately when >> describing Julia to colleagues I mention the type system and multiple >> dispatch and several other aspects showing how well-designed Julia is. But >> the point that I emphasize is "one language", which sometimes I extend to >> "One language to rule them all" (I assume everyone is familiar with "Lord >> of the Rings"). I can write Julia code at in a high-level, vectorized >> style (like R, Matlab/Octave) but I can also, if I need to, write low-level >> iterative code in Julia. I don't need to use a compiled language write >> interface code. >> >> If I have very large arrays, perhaps even memory-mapped arrays because >> they are so large, I could define a function >> >> function rowdot{T}(a::DenseMatrix{T},b::DenseMatrix{T}) >> ((m,n) = size(a)) == size(b) || throw(DimensionMismatch("")) >> res = zeros(T,(m,)) >> for j in 1:n, i in 1:m >> res[i] += a[i,j] * b[i,j] >> end >> res >> end >> >> that avoided creating the temporary. Once I convinced myself that there >> were no problems in the code (and my first version did indeed have a bug) I >> could change the loop to >> >> @simd for j in 1:n, i in 1:m >> @inbounds res[i] += a[i,j] * b[i,j] >> end >> >> and improve the performance. In the next iteration I could use >> SharedArrays and parallelize the calculation if it really needed it. >> >> As a programmer I am grateful for the incredible freedom that Julia gives >> me to get as obsessive compulsive about performance as I want. >> >> Thanks! >>> >> >> You're welcome. >> >> >>> >>> On Thursday, November 6, 2014 10:42:26 AM UTC-5, Douglas Bates wrote: >>>> >>>> >>>> >>>> On Thursday, November 6, 2014 9:14:10 AM UTC-6, Sebastian Good wrote: >>>>> >>>>> Working through the excellent coursera machine-learning course, I >>>>> found myself using the row-wise (axis-wise) dot product in Octave, but >>>>> found there was no obvious equivalent in Julia. >>>>> >>>>> In Octave/Matlab, one can call dot(a,b,2) to get the row-wise dot >>>>> product of two mxn matrices, returned as a new column vector of size mx1. >>>>> >>>>> Even though Julia makes for loops faster, I like sum(dot(a,b,2)) for >>>>> its concision over the equivalent array comprehension or explicit for >>>>> loop. >>>>> >>>>> Hopefully I'm just missing an overload or alternate name? >>>>> >>>> >>>> >>>> julia> a = rand(10,4) >>>> 10x4 Array{Float64,2}: >>>> 0.134279 0.135088 0.33185 0.956108 >>>> 0.977812 0.219557 0.887589 0.468597 >>>> 0.69524 0.310889 0.449669 0.717189 >>>> 0.385896 0.675195 0.0810221 0.179553 >>>> 0.717348 0.138556 0.52147 0.458516 >>>> 0.821631 0.337048 0.367002 0.320554 >>>> 0.531433 0.0298744 0.344748 0.722242 >>>> 0.708596 0.550999 0.629017 0.787594 >>>> 0.803008 0.380515 0.729874 0.744713 >>>> 0.166205 0.5589 0.605327 0.246186 >>>> >>>> julia> b = randn(10,4) >>>> 10x4 Array{Float64,2}: >>>> 0.551047 -0.284285 -1.33048 0.0216755 >>>> -1.16133 -0.552537 0.395243 -1.72303 >>>> -0.0181444 -0.481539 -0.985497 0.352999 >>>> 1.20222 -0.557973 0.428804 -1.1013 >>>> 2.31078 0.0909548 0.329372 0.651853 >>>> 0.341906 -0.109811 -0.360118 0.550494 >>>> 0.988644 1.02413 0.570208 0.48143 >>>> -1.75465 0.147909 -1.35159 0.89136 >>>> -0.105066 -1.04501 -0.682836 0.600948 >>>> 0.556118 -1.24914 -2.45667 -1.02942 >>>> >>>> julia> sum(a .* b,2) >>>> 10x1 Array{Float64,2}: >>>> -0.385205 >>>> -1.71347 >>>> -0.3523 >>>> -0.0758094 >>>> 2.14088 >>>> 0.288209 >>>> 1.10028 >>>> -1.30999 >>>> -0.532863 >>>> -2.34624 >>>> >>>> >>>> >
