I should have mentioned: to make these examples work, you need to start off
your session by adding DualNumbers and PowerSeries if you don't have them
already, and then loading them with "using":
julia> Pkg.add("DualNumbers")
julia> Pkg.add("PowerSeries")
julia> using DualNumbers
julia> using PowerSeries
On Wednesday, January 22, 2014 12:22:37 AM UTC-8, Jason Merrill wrote:
>
> The examples you mention are definitely interesting ones. Thanks for
> mentioning them.
>
> On Tuesday, January 21, 2014 4:05:15 AM UTC-8, Hans W Borchers wrote:
>>
>> Think of a special function not (yet) implemented in Julia, like Lambert
>> W, or Struve, or ... So you define it yourself, e.g. by applying
>> Newton-Raphson. How would you symbolically differentiate that?
>>
>
> Here's Newton's method for the Lambert W function:
>
> julia> function lambert_newton(x)
>            # Newton iteration for y*exp(y) = x, i.e. y = W(x)
>            y = zero(x)
>            for i = 1:25
>                y = y - (y*exp(y) - x)/(exp(y)*(1 + y))
>            end
>            return y
>        end
>
> julia> lambert_newton(1.0*e)
> 1.0
>
> You can differentiate this function to full precision by passing a Dual as
> an argument:
>
> julia> lambert_newton(Dual(e, 1.0))
> 1.0 + 0.18393972058572117du
>
> Compare to the exact value [1]
>
> julia> 1/2e
> 0.18393972058572117
>
> You can take a second derivative using PowerSeries (the master branch [2])
> by passing a 2nd-order power series as an argument and applying polyder
> twice:
>
> julia> polyder(polyder(lambert_newton(series(e, 1.0, 0.0))))
> -0.05075073121372976
>
> Compare to the exact value [3]
>
> julia> -3/8e^2
> -0.050750731213729756
>
> In this case, we're off by one bit.
>
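> If you want this repeatedly, the polyder pattern wraps up into a one-liner.
> This is just a sketch; second_derivative is a name I'm making up for
> illustration, not part of the PowerSeries API:
>
> julia> second_derivative(f, x) = polyder(polyder(f(series(x, one(x), zero(x)))))
> julia> second_derivative(lambert_newton, e)
> -0.05075073121372976
>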
>> Or your application forces you to define a procedure with if-else
>> constructs and loops; how do you differentiate that? Even an advanced
>> "automatic differentiation" approach will most often not be capable of
>> solving this for you.
>>
>> Or imagine having to minimize a function such as
>>
>> function repelling_balls(x::Array{Float64, 1})
>>     if length(x) != 45; error("Arg. 'x' must be a vector of length 45."); end
>>     d = 2.0
>>     for i = 1:14, j = (i+1):15
>>         s = sqrt((x[i]-x[j])^2 + (x[i+15]-x[j+15])^2 + (x[i+30]-x[j+30])^2)
>>         if s < d; d = s; end
>>     end
>>     return -d
>> end
>>
>> that models 15 repelling balls/points in [0, 1]^3.
>>
>
> As for your second example, you can extract the gradient vector element by
> element: lift one coordinate at a time to a dual number, i.e. replace x[i]
> by Dual(x[i], 1.0), and then examine the du part of the result. You just
> need to drop the Array{Float64, 1} annotation from your definition.
>
> julia> x = rand(45) + Dual(0.0, 0.0)  # lift to an array of duals
> julia> grad = zeros(Float64, 45)
> julia> for i = 1:45
>            xp = copy(x)  # copy, so the seed doesn't accumulate in x
>            xp[i] += Dual(0.0, 1.0)
>            r = repelling_balls(xp)
>            grad[i] = r.du
>        end
> julia> grad
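>
> If you need gradients like this in more than one place, the loop wraps
> naturally into a helper. This is only a sketch, and dual_gradient is a name
> I'm inventing, not part of DualNumbers:
>
> julia> function dual_gradient(f, x)
>            xd = x + Dual(0.0, 0.0)  # lift to an array of duals
>            g = zeros(Float64, length(x))
>            for i = 1:length(x)
>                xp = copy(xd)
>                xp[i] += Dual(0.0, 1.0)  # seed the i-th coordinate
>                g[i] = f(xp).du
>            end
>            return g
>        end
> julia> dual_gradient(repelling_balls, rand(45))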
>
> The components of grad are now probably accurate to within 1 bit. This
> example works out a little funny because the box size is small compared to
> the ball size, but you can adjust the parameters to fix that. In principle,
> you could extract the Hessian by taking directional second derivatives with
> PowerSeries, but I haven't fully worked that out yet. It could be
> interesting for finding the stiffness about the equilibrium point once
> you've found it.
>
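> Roughly, the idea is the sketch below: lift each coordinate to a power
> series whose first-order part is a direction vector v; the result's second
> derivative is the second derivative of f along v, i.e. v'*H*v, and
> evaluating it for enough directions v would recover the full Hessian.
> directional_d2 is a made-up name, and this assumes f only uses operations
> PowerSeries supports:
>
> julia> function directional_d2(f, x, v)
>            xs = [series(x[i], v[i], 0.0) for i = 1:length(x)]
>            polyder(polyder(f(xs)))
>        end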
>
>> As "minimax" problem this function is smooth except for a subset of lower
>> dimension. Some optimization procedures using gradients will be able to
>> circumvent this subset and find a (local) minimum. How will you
>> differentiate that (i.e., build a gradient) if not numerically; and if yes,
>> is it worth the effort?
>>
>
> In both of these examples, you pay nothing in effort or performance
> compared to finite difference differentiation, and you avoid losing half
> your precision. I think that's very exciting!
>
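> To make that concrete, compare a central difference on lambert_newton with
> the Dual result from above (the step size 1e-6 is just a typical
> uncalibrated choice):
>
> julia> h = 1e-6
> julia> (lambert_newton(e + h) - lambert_newton(e - h))/(2h)
>
> The usual error analysis says to expect roughly 9 or 10 correct digits from
> the central difference here, while Dual(e, 1.0) reproduced every digit of
> 1/2e.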
>
>> Consider that an optimization or root finding procedure is a (hopefully)
>> converging process that will not return an absolutely precise result. A
>> function may itself be defined through an optimization or root finding
>> task, and its derivative will still have a technical meaning.
>>
>> In real-world applications, and tasks Matlab is made for, you do not care
>> for 15 digits, and gradients five, six, seven digits accurate are more than
>> enough to solve your problem. Normally, your functions are not defined for
>> eps() accuracy anyway.
>>
>
> I should not have downplayed the significance or utility of finite
> difference differentiation. It is obviously tremendously important, and is
> suitable for many applications. Julia should strive to implement the best
> versions possible. Currently, different packages are at different points of
> their development.
>
> That said, it seems like you're trying to have it both ways here: you're
> upset that Calculus throws away 2-3 digits compared to a better finite
> difference method, but at the same time you're downplaying the significance
> of recovering the remaining 7 digits needed for full precision.
>
> In a specific application, it's often obvious how many digits are
> relevant, and that number is definitely often less than 15. But as package
> authors, we should strive to compute to full precision when that's possible
> without undue complexity or extreme performance tradeoffs. The problem is
> that if packages A, B, and C all decide that 8 digits is good enough for
> anyone, and so throw away half the precision of floating point arguments,
> then when I take the output of package A as input to package B, and the
> output of B as input to C, suddenly I don't have 8 digits, I have 2. I'm
> sure you understand this point already, but it's something I didn't
> completely appreciate until recently.
>
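> Here's a toy version of that compounding: a second derivative computed as a
> finite difference of a finite difference (the function and step sizes are
> just ones I'm picking for illustration):
>
> julia> fd(f, x, h) = (f(x + h) - f(x - h))/(2h)
> julia> d2_fd(x) = fd(y -> fd(sin, y, 1e-6), x, 1e-6)
> julia> d2_fd(1.0) + sin(1.0)  # error of the nested estimate
>
> Each stage alone is good to 9-10 digits, but the composition is only good
> to about 4. The PowerSeries second derivative above, by contrast, was off
> by one bit.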
>
>> When I came here, I thought Julia was meant for "technical computing" and
>> not so much for pure Math exercises. In scientific computing, calculating
>> numerical derivatives and gradients is essential for many applications.
>>
>>
> I hope it will be for both. There are a lot of amazing people working on it.
>
> [1] First derivative of Lambert W:
> http://www.wolframalpha.com/input/?i=LambertW'[e]
> [2] The release of PowerSeries that makes this work is this PR:
> https://github.com/JuliaLang/METADATA.jl/pull/535
> [3] Second derivative of Lambert W:
> http://www.wolframalpha.com/input/?i=LambertW''[e]
>
>
>> On Tuesday, January 21, 2014 3:28:28 AM UTC+1, Jason Merrill wrote:
>>>
>>> Would you be willing to share a problem that you're interested in that
>>> isn't amenable to symbolic differentiation, maybe on a gist or something?
>>> I've been looking for non-trivial problems to try to apply PowerSeries to.
>>>
>>> When I said numerical differentiation is "unstable", I meant something
>>> like "hard to do without losing several digits". For example, you point out
>>> that by choosing a different h, it's possible to improve over Calculus.jl's
>>> current behavior by 2-3 decimal digits, from 7e-7 to 2e-9 in absolute
>>> error. But the best precision you could hope for in a floating point answer
>>> to this problem is
>>>
>>> julia> eps(sin(1.0))
>>> 1.1102230246251565e-16
>>>
>>> so even with your improvements, you've lost a lot of precision. That
>>> might or might not be a problem depending on what you'd like to do next
>>> with the value you get out.
>>>
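>>> One way to put a number on the loss (this just compares the achieved
>>> error to the best representable one, nothing specific to Calculus.jl):
>>>
>>> julia> log10(2e-9 / eps(sin(1.0)))  # decimal digits lost vs. the best case
>>>
>>> which comes out to a bit over 7, so roughly 7 of the ~16 available digits
>>> are gone even after tuning h.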
>>>
>>> On Monday, January 20, 2014 12:30:38 PM UTC-8, Hans W Borchers wrote:
>>>>
>>>> Numerical differentiation is not nearly as unstable as you seem to
>>>> think. And I have long experience using numerical derivatives for
>>>> optimization problems, where you don't stop to look up symbolic
>>>> derivatives with a CAS.
>>>> The function obviously was only an example.
>>>> For most of the functions I have used, Julia's symbolic capabilities
>>>> will not be sufficient.
>>>>
>>>>
>>>> On Monday, January 20, 2014 9:07:02 PM UTC+1, Jason Merrill wrote:
>>>>>
>>>>> This implementation could certainly use some love, but finite
>>>>> difference differentiation is always unstable, and the situation gets
>>>>> worse as you take higher-order derivatives.
>>>>>
>>>>> You might want to consider using the differentiate method to take your
>>>>> derivatives symbolically (if this works for the functions you're
>>>>> using), or have a look at the differentiation example in PowerSeries.jl.
>>>>> To do a symbolic second derivative, you can do, e.g.
>>>>>
>>>>> julia> using Calculus
>>>>> julia> @eval d2(x) = $(differentiate(differentiate(:(sin(x)))))
>>>>> julia> d2(1.0)
>>>>> -0.8414709848078965
>>>>>
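>>>>> For comparison, the PowerSeries route to the same second derivative
>>>>> would look something like this (assuming sin is defined for series the
>>>>> way exp is):
>>>>>
>>>>> julia> using PowerSeries
>>>>> julia> polyder(polyder(sin(series(1.0, 1.0, 0.0))))
>>>>>
>>>>> which should agree with the symbolic result to within a bit or so.
>>>>>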
>>>>