* r⁻² should be r², of course.

Note that the following very naive version of LJ(r)

function LJ(r)
    r² = sumabs2(r)                           # squared distance |r|²
    J = 1.0/(r²)^6 - 2.0/(r²)^3               # 12-6 potential written in terms of r²
    ∇ᵣJ = (-12.0/(r²)^6 + 12.0/(r²)^3) / r²   # (1/r) dJ/dr, so the force along r[i] is ∇ᵣJ*r[i]
    return J, ∇ᵣJ
end

results in code that runs in 0.42 s, which is not terrible (and comparable
performance would be expected for more complicated Lennard-Jones-type
potentials that are not 12-6).
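For reference, here is a self-contained sketch (using the modern spelling
`sum(abs2, r)` in place of the 0.3-era `sumabs2`, and hypothetical names
`LJ_naive`/`LJ_fast`) checking that the power-based version above and the
multiplication-based version from my earlier message agree:

```julia
# Naive version: powers of r² computed with ^
function LJ_naive(r)
    r2 = sum(abs2, r)                      # sumabs2(r) in 0.3-era Julia
    J  = 1.0/r2^6 - 2.0/r2^3
    dJ = (-12.0/r2^6 + 12.0/r2^3) / r2     # (1/r) dJ/dr
    return J, dJ
end

# Faster version: r⁻⁶ built from explicit multiplications
function LJ_fast(r)
    r2 = sum(abs2, r)
    r6inv = 1.0/(r2*r2*r2)
    J  = r6inv*r6inv - 2.0r6inv
    dJ = -12.0*(r6inv*r6inv - r6inv)/r2
    return J, dJ
end

r = [1.1, 0.2, -0.3]
J1, d1 = LJ_naive(r)
J2, d2 = LJ_fast(r)
println(isapprox(J1, J2) && isapprox(d1, d2))
```

Both versions should print `true` for the agreement check; only their speed
differs.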

Thanks,

Jiahao Chen
Staff Research Scientist
MIT Computer Science and Artificial Intelligence Laboratory

On Mon, Dec 1, 2014 at 9:40 PM, Jiahao Chen <[email protected]> wrote:

> I took a look at the Lennard-Jones example. On top of the memory
> allocation from array slicing, there is also a lot of overhead in passing
> lj(r) as an input function phi. There is also some overhead from computing
> size(x) in the range for the inner loop.
>
> Consider this version of the code, which is "C style" in that the loops
> are explicitly unrolled, but otherwise is not that far from the original
> lj_pretty:
>
> # definition of the Lennard-Jones potential
> function LJ(r)
>     r⁻² = sumabs2(r)
>     r⁻⁶ = 1.0/(r⁻²*r⁻²*r⁻²)
>     J = r⁻⁶*r⁻⁶ - 2.0r⁻⁶
>     ∇ᵣJ = -12. * (r⁻⁶*r⁻⁶ - r⁻⁶) / r⁻²
>     J, ∇ᵣJ #<--
> end
>
> function lj_cstyle2(x)
>     d, N = size(x)
>     E = 0.0
>     ∇E = zeros(x) #note that zeros supports this form too
>     r = zeros(d)
>     @inbounds for n = 1:(N-1), k = (n+1):N
>         for i = 1:d
>             r[i] = x[i,k]-x[i,n]
>         end
>         J, ∇ᵣJ = LJ(r) #<--
>         E += J
>         for i = 1:d
>             ∇E[i, k] += ∇ᵣJ * r[i]
>             ∇E[i, n] -= ∇ᵣJ * r[i]
>         end
>     end
>     E, ∇E #return keyword not strictly required
> end
>
> For me, lj_cstyle2 runs in 83 ms, compared to 66 ms for lj_cstyle.
>
> There is some overhead from constructing and unpacking the tuples in the
> two lines marked with  #<--, but I think it is unavoidable if you want to
> have a single external function that computes both the potential and its
> gradient.
>
> Note that this version of LJ(r) computes two numbers, not a number and a
> vector. For any central potential the gradient along r, ∇ᵣJ, is sufficient
> to characterize the entire force ∇E, so there is not much loss of
> generality in not returning the force in the original coordinates. In this
> case, the mathematical structure of the problem is not entirely at odds
> with its implementation in code.
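As a sanity check on that claim, a small sketch (hypothetical helper
`num_grad`, modern `sum(abs2, ·)` spelling) verifying by central finite
differences that the scalar ∇ᵣJ times the displacement vector reproduces the
full coordinate gradient of J:

```julia
function LJ(r)
    r2 = sum(abs2, r)
    r6inv = 1.0/(r2*r2*r2)
    J  = r6inv*r6inv - 2.0r6inv
    dJ = -12.0*(r6inv*r6inv - r6inv)/r2    # (1/r) dJ/dr
    return J, dJ
end

# central finite-difference gradient of r -> LJ(r)[1]
function num_grad(r; h=1e-6)
    g = zeros(length(r))
    for i in 1:length(r)
        rp = copy(r); rp[i] += h
        rm = copy(r); rm[i] -= h
        g[i] = (LJ(rp)[1] - LJ(rm)[1]) / (2h)
    end
    return g
end

r = [1.05, -0.2, 0.4]
_, dJ = LJ(r)
println(maximum(abs.(dJ .* r .- num_grad(r))))   # small, up to fd error
```

The analytic force components dJ .* r and the numerical gradient should match
to finite-difference accuracy.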
>
> Thanks,
>
> Jiahao Chen
> Staff Research Scientist
> MIT Computer Science and Artificial Intelligence Laboratory
>
> On Mon, Dec 1, 2014 at 12:14 PM, Christoph Ortner <
> [email protected]> wrote:
>
>> Hi Steven,
>> Thanks for the feedback - yes it was intentional, but I am definitely
>> planning to extend the notebook with performance enhancements.
>> I was not yet aware of this distinction between Int and Integer. Thanks
>> for pointing it out.
>>     Christoph
>>
>>
>> On Monday, 1 December 2014 15:49:10 UTC, Steven G. Johnson wrote:
>>>
>>> I notice that in your finite-element notebook you use:
>>>
>>>     T = zeros(Integer, 3, 2*N^2)
>>>
>>>
>>> You really want to use Int in cases like this (Int = Int64 on 64-bit
>>> machines and Int32 on 32-bit machines).  Using Integer means that you have
>>> an array of pointers to generic integer containers, whose type must be
>>> checked at runtime (e.g. T[i] could be an Int32, Int64, BigInt, etc.).
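A quick sketch of that distinction (modern Julia spelling, with `isbitstype`
in place of the older `isbits` on types):

```julia
A = zeros(Int, 3, 4)       # concrete eltype: Ints stored inline, fast
B = zeros(Integer, 3, 4)   # abstract eltype: boxed elements, runtime checks

println(isbitstype(eltype(A)))   # true  -- elements stored inline
println(isbitstype(eltype(B)))   # false -- elements are pointers to boxes

B[1] = Int32(7)   # one array can now mix Int32, Int64, BigInt, ...
println(typeof(B[1]))
```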
>>>
>>> (You also have very Matlab-like code that allocates zillions of little
>>> temporary arrays in your inner loops. I assume this is intentional, but it
>>> is definitely suboptimal for performance, possibly by an order of magnitude.)
>>>
>>
>
