Dual is not a subtype of Real, so there is no matching method when you call
the function with a Dual argument. You need to define the function like so:
function lambert_W(x::Number)
    ...
end
then:
julia> lambert_W(Dual(e, 1.0))
1.0 + 0.18393972058572117du
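
For completeness, here is a sketch of the whole function with the relaxed
signature (the abs(w1 - w0) stopping test is replaced by a fixed iteration
count so that no comparisons on Dual values are needed, and the domain check
for x < -1/e is left out to keep the sketch short; otherwise it is Hans
Werner's Halley iteration):

using DualNumbers

function lambert_W(x::Number)
    w = one(x)                  # same starting guess as in the original
    for n = 1:20                # fixed iteration count instead of the abs() test
        f = w * exp(w) - x
        w = w - f / ((w + 1) * exp(w) - (w + 2) * f / (2 * w + 2))
    end
    return w
end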
Mauro
On Wed, 2014-01-22 at 11:30, Hans W Borchers wrote:
> Jason, very nice, and I will certainly look at DualNumbers.jl. But I have
> doubts about your approach to the Lambert W function.
>
> See, lambert_newton(-1/e) returns -0.9999999434202022 and should return -1.
> Below that, the real branch of Lambert W is not defined, but
> lambert_newton(-0.5) returns 0.9706698296525587 though we are in the
> imaginary branch.
>
> This is not so problematic as you could return NaN for those values. What
> about parameters x > 20, for example
>
> julia> lambert_newton(25)
> 3.1737928264162476
>
> while the true value is 2.36015... And lambert_newton(1000) returns NaN,
> true value 5.249603... That is why CASs such as Maple compute the Lambert W
> function through a Halley iteration such as
>
> function lambert_W(x::Real)
>     if x < -1/exp(1); return NaN; end
>     w0 = 1.0
>     w1 = w0 - (w0 * exp(w0) - x)/((w0 + 1) * exp(w0) -
>               (w0 + 2) * (w0 * exp(w0) - x)/(2 * w0 + 2))
>     n = 1
>     while abs(w1 - w0) > 1e-15 && n <= 20
>         w0 = w1
>         w1 = w0 - (w0 * exp(w0) - x)/((w0 + 1) * exp(w0) -
>                   (w0 + 2) * (w0 * exp(w0) - x)/(2 * w0 + 2))
>         n += 1
>     end
>     return w1
> end
>
> Now the value of lambert_W(-1/e) = -0.9999999911436428 is slightly better
> than before, and for x >= 20 and much larger arguments we get correct results.
>
> Assume lambert_W is defined through this iterative procedure; will
> DualNumbers.jl work for this function? I did not succeed:
>
> julia> lambert_W(Dual(e, 1.0))
> ERROR: no method lambert_W(Dual{Float64},)
>
> By the way, Lambert's W function at 1.0 is the Omega constant: both
> lambert_W and lambert_newton return 0.5671432904097838, which is correct to
> the last digit.
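>
> A quick consistency check of that, since the Omega constant satisfies
> w * exp(w) == 1 (assuming the lambert_W defined above is in scope):
>
> julia> w = lambert_W(1.0)
> julia> w * exp(w)    # should be 1.0 up to rounding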
>
> Hans Werner
>
>
> On Wednesday, January 22, 2014 9:22:37 AM UTC+1, Jason Merrill wrote:
>>
>> The examples you mention are definitely interesting ones. Thanks for
>> mentioning them.
>>
>> On Tuesday, January 21, 2014 4:05:15 AM UTC-8, Hans W Borchers wrote:
>>>
>>> Think of a special function not (yet) implemented in Julia, like Lambert
>>> W, or Struve, or ... So you define it yourself, e.g. by applying
>>> Newton-Raphson. How would you symbolically differentiate that?
>>>
>>
>> Here's Newton's method for the Lambert W function:
>>
>> julia> function lambert_newton(x)
>>            y = zero(x)
>>            for i = 1:25
>>                y = y - (y*exp(y)-x)/(exp(y)*(1+y))
>>            end
>>            return y
>>        end
>>
>> julia> lambert_newton(1.0*e)
>> 1.0
>>
>> You can differentiate this function to full precision by passing a Dual as
>> an argument:
>>
>> julia> lambert_newton(Dual(e, 1.0))
>> 1.0 + 0.18393972058572117du
>>
>> Compare to the exact value [1]
>>
>> julia> 1/2e
>> 0.18393972058572117
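>>
>> (Since w*exp(w) = x, implicit differentiation gives W'(x) = W(x)/(x*(1 + W(x))),
>> and W(e) = 1, so W'(e) = 1/(2e).)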
>>
>> You can take a second derivative using (the master branch of [2])
>> PowerSeries by passing a 2nd-order power series as an argument and applying
>> polyder twice:
>>
>> julia> polyder(polyder(lambert_newton(series(e, 1.0, 0.0))))
>> -0.05075073121372976
>>
>> Compare to the exact value [3]
>>
>> julia> -3/8e^2
>> -0.050750731213729756
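>>
>> (Differentiating W'(x) = W(x)/(x*(1 + W(x))) once more gives
>> W''(x) = -W(x)^2*(W(x) + 2)/(x^2*(1 + W(x))^3), which at x = e, with
>> W(e) = 1, is -3/(8e^2).)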
>>
>> In this case, we're off by one bit.
>>
>>> Or your application forces you to define a procedure with if-else
>>> constructs and loops; how do you differentiate that? Even an advanced
>>> "automatic differentiation" approach will most often not be capable of
>>> solving this for you.
>>>
>>> Or imagine having to minimize a function such as
>>>
>>> function repelling_balls(x::Array{Float64, 1})
>>>     if length(x) != 45; error("Arg. 'x' must be a vector of length 45."); end
>>>     d = 2.0
>>>     for i = 1:14, j = (i+1):15
>>>         s = sqrt((x[i]-x[j])^2 + (x[i+15]-x[j+15])^2 + (x[i+30]-x[j+30])^2)
>>>         if s < d; d = s; end
>>>     end
>>>     return -d
>>> end
>>>
>>> that models 15 repelling balls/points in [0, 1]^3.
>>>
>>
>> As for your second example, you can extract the gradient vector element by
>> element by lifting one coordinate at a time to a dual number, i.e. replace
>> x[i] by Dual(x[i], 1.0), and then examining the du part of the result. You
>> just need to drop the type annotation in your definition.
>>
>> julia> x = rand(45) + Dual(0.0, 0.0)
>> julia> grad = zeros(Float64, 45)
>> julia> for i = 1:45
>>            xp = copy(x)           # copy, so each coordinate is perturbed independently
>>            xp[i] += Dual(0.0, 1.0)
>>            r = repelling_balls(xp)
>>            grad[i] = r.du
>>        end
>> julia> grad
>>
>> The components of grad are now probably accurate to within 1 bit. This
>> example works out a little funny because the box size is small compared
>> to the ball size, but you can adjust the parameters to fix it. In
>> principle, you could extract the Hessian by taking directional second
>> derivatives with PowerSeries, but I haven't fully worked that out yet. It
>> could be interesting for knowing the stiffness about the equilibrium point
>> once you've found it.
>>
>>
>>> As "minimax" problem this function is smooth except for a subset of lower
>>> dimension. Some optimization procedures using gradients will be able to
>>> circumvent this subset and find a (local) minimum. How will you
>>> differentiate that (i.e., build a gradient) if not numerically; and if yes,
>>> is it worth the effort?
>>>
>>
>> In both of these examples, you pay nothing in effort or performance
>> compared to finite difference differentiation, and you avoid losing half
>> your precision. I think that's very exciting!
>>
>>
>>> Consider that an optimization or root finding procedure is a (hopefully)
>>> converging process that will not return an absolutely precise result. A
>>> function may itself be defined through an optimization or root finding
>>> task, and its derivative will still have a technical meaning.
>>>
>>> In real-world applications, and the tasks Matlab is made for, you do not
>>> care about 15 digits; gradients accurate to five, six, or seven digits are
>>> more than enough to solve your problem. Normally, your functions are not
>>> defined to eps() accuracy anyway.
>>>
>>
>> I should not have downplayed the significance or utility of finite
>> difference differentiation. It is obviously tremendously important, and is
>> suitable for many applications. Julia should strive to implement the best
>> versions possible. Currently, different packages are at different points of
>> their development.
>>
>> That said, it seems like you're trying to have it both ways here, in the
>> sense that you're upset that Calculus throws away 2-3 digits compared to a
>> better finite difference method, but at the same time you're downplaying
>> the significance of recovering the other 7 digits needed for full precision.
>>
>> In a specific application, it's often obvious how many digits are
>> relevant, and that number is definitely often less than 15. But as package
>> authors, we should strive to compute to full precision when that's possible
>> without undue complexity or extreme performance tradeoffs. The problem is
>> that if packages A, B, and C all decide that 8 digits is good enough for
>> anyone and so throw away half the precision of floating point arguments,
>> and I decide to take the output of package A as input to package B, and the
>> output of B as input to C, suddenly I don't have 8 digits, I have 2. I'm
>> sure you probably understand this point already, but it's something I
>> didn't completely appreciate until recently.
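>>
>> A toy illustration of that compounding (the fd helper below is just a plain
>> central difference written for this example, not any package's function):
>>
>> julia> h = 1e-8                              # a typical "good enough" step
>> julia> fd(f, x) = (f(x + h) - f(x - h)) / (2h)
>> julia> fd(sin, 1.0) - cos(1.0)               # roughly 8 digits survive one difference
>> julia> fd(x -> fd(sin, x), 1.0) + sin(1.0)   # nesting one difference inside another loses most of the rest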
>>
>>
>>> When I came here, I thought Julia was meant for "technical computing" and
>>> not so much for pure Math exercises. In scientific computing, calculating
>>> numerical derivatives and gradients is essential for many applications.
>>>
>>>
>> I hope it will be for both. There's a lot of amazing people working on it.
>>
>> [1] First derivative of Lambert W:
>> http://www.wolframalpha.com/input/?i=LambertW'[e]
>> [2] The release of PowerSeries that makes this work is this PR:
>> https://github.com/JuliaLang/METADATA.jl/pull/535
>> [3] Second derivative of Lambert W:
>> http://www.wolframalpha.com/input/?i=LambertW''[e]
>>
>>
>>> On Tuesday, January 21, 2014 3:28:28 AM UTC+1, Jason Merrill wrote:
>>>>
>>>> Would you be willing to share a problem that you're interested in that
>>>> isn't amenable to symbolic differentiation, maybe on a gist or something?
>>>> I've been looking for non-trivial problems to try to apply PowerSeries to.
>>>>
>>>> When I said numerical differentiation is "unstable", I meant something
>>>> like "hard to do without losing several digits". For example, you point
>>>> out that by choosing a different h, it's possible to improve over
>>>> Calculus.jl's current behavior by 2-3 decimal digits, from 7e-7 to 2e-9
>>>> in absolute error. But the best precision you could hope for in a
>>>> floating point answer to this problem is
>>>>
>>>> julia> eps(sin(1.0))
>>>> 1.1102230246251565e-16
>>>>
>>>> so even with your improvements, you've lost a lot of precision. That
>>>> might or might not be a problem depending on what you'd like to do next
>>>> with the value you get out.
>>>>
>>>>
>>>> On Monday, January 20, 2014 12:30:38 PM UTC-8, Hans W Borchers wrote:
>>>>>
>>>>> Numerical differentiation is not nearly as unstable as you seem to
>>>>> think. And I have long experience in using numerical derivatives for
>>>>> optimization problems where you don't stop to look up symbolic
>>>>> derivatives with a CAS.
>>>>> The function obviously was only an example.
>>>>> For most of the functions I have used, Julia's symbolic capabilities
>>>>> will not be sufficient.
>>>>>
>>>>>
>>>>> On Monday, January 20, 2014 9:07:02 PM UTC+1, Jason Merrill wrote:
>>>>>>
>>>>>> This implementation could certainly use some love, but finite
>>>>>> difference differentiation is always unstable, and the situation gets
>>>>>> worse
>>>>>> as you take higher order derivatives.
>>>>>>
>>>>>> You might want to consider using the differentiate method to take your
>>>>>> derivatives symbolically (if this works for the functions you're using),
>>>>>> or
>>>>>> have a look at the differentiation example in PowerSeries.jl. To do a
>>>>>> symbolic second derivative, you can do, e.g.
>>>>>>
>>>>>> julia> using Calculus
>>>>>> julia> @eval d2(x) = $(differentiate(differentiate(:(sin(x)))))
>>>>>> julia> d2(1.0)
>>>>>> -0.8414709848078965
>>>>>>
>>>>>
--
Mauro A Werder
Bristol Glaciology Centre
School of Geographical Sciences
University of Bristol, UK
Telephone: +44 (0) 117 3314154
E-Mail: [email protected]
http://maurow.bitbucket.org