I wonder what the assumptions on a type that is <: Real should be. If it
should correspond to a subset of the real line, then I guess that dual
numbers are out.
But maybe the contract should be something looser, such as satisfying
real(x) === x? Or maybe there should be a common supertype of Real and
Dual, whatever it might be called and stand for, that could be used
instead of Real in many places?
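
Until such a supertype exists, generic code can of course stay agnostic by
dispatching on Number (as Mauro suggests below) or by spelling out a union
explicitly. A minimal sketch, with Dual taken from DualNumbers.jl and the
function names made up:

julia> using DualNumbers
julia> sq(x::Number) = x * x               # accept anything numeric
julia> sq2(x::Union(Real, Dual)) = x * x   # or name the admissible types explicitly
julia> sq(Dual(2.0, 1.0))
4.0 + 4.0du
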
On Wednesday, 22 January 2014 13:25:38 UTC+1, Stefan Karpinski wrote:
>
> I wonder if Dual shouldn't be a subtype of Real. Of course, you can
> probably have Dual complex numbers as well, but maybe those can be
> Complex{Dual} in that case.
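>
> (One concrete reason the subtyping matters: Complex constrains its type
> parameter to Real; in Base it is declared roughly as
>
>     immutable Complex{T<:Real} <: Number
>         re::T
>         im::T
>     end
>
> so Complex{Dual{Float64}} only becomes possible if Dual <: Real.)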
>
>
> On Wed, Jan 22, 2014 at 7:07 AM, Mauro <[email protected]> wrote:
>
>> Dual is not a subtype of Real so there is no method when calling with
>> Dual. You need to define the function like so:
>>
>> function lambert_W(x::Number)
>> ...
>> end
>>
>> then:
>>
>> julia> lambert_W(Dual(e, 1.0))
>> 1.0 + 0.18393972058572117du
>>
>> Mauro
>>
>> On Wed, 2014-01-22 at 11:30, Hans W Borchers wrote:
>> > Jason, very nice, and I will certainly look at DualNumbers.jl. But I have
>> > doubts about your approach to the Lambert W function.
>> >
>> > See, lambert_newton(-1/e) returns -0.9999999434202022 and should return -1.
>> > Below that, the real branch of Lambert W is not defined, but
>> > lambert_newton(-0.5) returns 0.9706698296525587 even though the value
>> > there is complex.
>> >
>> > This is not so problematic, as you could return NaN for those values. What
>> > about arguments x > 20, for example
>> >
>> > julia> lambert_newton(25)
>> > 3.1737928264162476
>> >
>> > while the true value is 2.36015... And lambert_newton(1000) returns NaN,
>> > true value 5.249603... That is why CASs such as Maple define the Lambert W
>> > function through a Halley-type iteration such as
>> >
>> > function lambert_W(x::Real)
>> >     if x < -1/exp(1); return NaN; end
>> >     w0 = 1.0
>> >     w1 = w0 - (w0 * exp(w0) - x)/((w0 + 1) * exp(w0) -
>> >              (w0 + 2) * (w0 * exp(w0) - x)/(2 * w0 + 2))
>> >     n = 1
>> >     while abs(w1 - w0) > 1e-15 && n <= 20
>> >         w0 = w1
>> >         w1 = w0 - (w0 * exp(w0) - x)/((w0 + 1) * exp(w0) -
>> >                  (w0 + 2) * (w0 * exp(w0) - x)/(2 * w0 + 2))
>> >         n += 1
>> >     end
>> >     return w1
>> > end
>> >
>> > Now the value of lambert_W(-1/e) = -0.9999999911436428 is slightly better
>> > than before, and for x >= 20 and much larger arguments we get correct
>> > results.
>> >
>> > Assuming lambert_W is defined through this iterative procedure, will
>> > DualNumbers.jl work for it? I did not succeed:
>> >
>> > julia> lambert_W(Dual(e, 1.0))
>> > ERROR: no method lambert_W(Dual{Float64},)
>> >
>> > By the way, Lambert's W function at 1.0 is Gauss' omega constant: both
>> > lambert_W and lambert_newton return 0.5671432904097838, which is correct
>> > to the last digit.
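>> >
>> > One can check this directly from the defining relation w*exp(w) = x:
>> >
>> > julia> w = 0.5671432904097838
>> > julia> w * exp(w)    # gives 1.0 up to rounding, since W(1)*exp(W(1)) = 1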
>> >
>> > Hans Werner
>> >
>> >
>> > On Wednesday, January 22, 2014 9:22:37 AM UTC+1, Jason Merrill wrote:
>> >>
>> >> The examples you mention are definitely interesting ones. Thanks for
>> >> mentioning them.
>> >>
>> >> On Tuesday, January 21, 2014 4:05:15 AM UTC-8, Hans W Borchers wrote:
>> >>>
>> >>> Think of a special function not (yet) implemented in Julia, like Lambert
>> >>> W, or Struve, or ... So you define it yourself, e.g. by applying
>> >>> Newton-Raphson. How would you symbolically differentiate that?
>> >>>
>> >>
>> >> Here's Newton's method for the Lambert W function:
>> >>
>> >> julia> function lambert_newton(x)
>> >>            y = zero(x)
>> >>            for i = 1:25
>> >>                y = y - (y*exp(y)-x)/(exp(y)*(1+y))
>> >>            end
>> >>            return y
>> >>        end
>> >>
>> >> julia> lambert_newton(1.0*e)
>> >> 1.0
>> >>
>> >> You can differentiate this function to full precision by passing a Dual
>> >> as an argument:
>> >>
>> >> julia> lambert_newton(Dual(e, 1.0))
>> >> 1.0 + 0.18393972058572117du
>> >>
>> >> Compare to the exact value [1]
>> >>
>> >> julia> 1/2e
>> >> 0.18393972058572117
>> >>
>> >> You can take a second derivative using (the master branch of [2])
>> >> PowerSeries by passing a 2nd order power series as an argument, and
>> >> using polyder twice:
>> >>
>> >> julia> polyder(polyder(lambert_newton(series(e, 1.0, 0.0))))
>> >> -0.05075073121372976
>> >>
>> >> Compare to the exact value [3]
>> >>
>> >> julia> -3/8e^2
>> >> -0.050750731213729756
>> >>
>> >> In this case, we're off by one bit.
>> >>
>> >>> Or your application forces you to define a procedure with if-else
>> >>> constructs and loops; how do you differentiate that? Even an advanced
>> >>> "automatic differentiation" approach will most often not be capable of
>> >>> solving this for you.
>> >>>
>> >>> Or imagine having to minimize a function such as
>> >>>
>> >>> function repelling_balls(x::Array{Float64, 1})
>> >>>     if length(x) != 45; error("Arg. 'x' must be a vector of length 45."); end
>> >>>     d = 2.0
>> >>>     for i = 1:14, j = (i+1):15
>> >>>         s = sqrt((x[i]-x[j])^2 + (x[i+15]-x[j+15])^2 + (x[i+30]-x[j+30])^2)
>> >>>         if s < d; d = s; end
>> >>>     end
>> >>>     return -d
>> >>> end
>> >>>
>> >>> that models 15 repelling balls/points in [0, 1]^3.
>> >>>
>> >>
>> >> As for your second example, you can extract the gradient vector element
>> >> by element by lifting one coordinate at a time to a dual number, i.e.
>> >> replacing x[i] by Dual(x[i], 1.0), and then examining the du part of the
>> >> result. You just need to drop the type annotation in your definition.
>> >>
>> >> julia> x = rand(45) + Dual(0.0, 0.0)
>> >> julia> grad = zeros(Float64, 45)
>> >> julia> for i = 1:45
>> >>            xp = copy(x)    # copy, so each perturbation starts from the unperturbed x
>> >>            xp[i] += Dual(0.0, 1.0)
>> >>            r = repelling_balls(xp)
>> >>            grad[i] = r.du
>> >>        end
>> >> julia> grad
>> >>
>> >> The components of grad are now probably accurate to within 1 bit. This
>> >> example works out a little funny because the box size is small compared
>> >> to the ball size, but you can adjust the parameters to fix it. In
>> >> principle, you could extract the hessian by taking directional second
>> >> derivatives with PowerSeries, but I haven't fully worked that out yet.
>> >> That could be useful for finding the stiffness about the equilibrium
>> >> point once you've found it.
>> >>
>> >>
>> >>> As "minimax" problem this function is smooth except for a subset of
>> lower
>> >>> dimension. Some optimization procedures using gradients will be able
>> to
>> >>> circumvent this subset and find a (local) minimum. How will you
>> >>> differentiate that (i.e., build a gradient) if not numerically; and
>> if yes,
>> >>> is it worth the effort?
>> >>>
>> >>
>> >> In both of these examples, you pay nothing in effort or performance
>> >> compared to finite difference differentiation, and you avoid losing half
>> >> your precision. I think that's very exciting!
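>> >>
>> >> To make that concrete for the Lambert W example above: a central
>> >> difference (the step size h below is just an illustrative choice) gets
>> >> roughly nine or ten digits right, while the dual number reproduces 1/2e
>> >> to the last bit:
>> >>
>> >> julia> h = 1e-6
>> >> julia> (lambert_newton(e + h) - lambert_newton(e - h)) / (2h)  # ~9-10 digits of 1/2e
>> >> julia> lambert_newton(Dual(e, 1.0)).du
>> >> 0.18393972058572117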
>> >>
>> >>
>> >>> Consider that an optimization or root finding procedure is a (hopefully)
>> >>> converging process that will not return an absolutely precise result. A
>> >>> function itself may be defined through an optimization or root finding
>> >>> task, and its derivative will then have a technical meaning.
>> >>>
>> >>> In real-world applications, and the tasks Matlab is made for, you do not
>> >>> care about 15 digits; gradients accurate to five, six, or seven digits
>> >>> are more than enough to solve your problem. Normally, your functions are
>> >>> not defined to eps() accuracy anyway.
>> >>>
>> >>
>> >> I should not have downplayed the significance or utility of finite
>> >> difference differentiation. It is obviously tremendously important, and
>> >> is suitable for many applications. Julia should strive to implement the
>> >> best versions possible. Currently, different packages are at different
>> >> points of their development.
>> >>
>> >> That said, it seems like you're trying to have it both ways here, in the
>> >> sense that you're upset that Calculus throws away 2-3 digits compared to
>> >> a better finite difference method, but at the same time, you're
>> >> downplaying the significance of the other 7 digits needed to reach full
>> >> precision.
>> >>
>> >> In a specific application, it's often obvious how many digits are
>> >> relevant, and that number is definitely often less than 15. But as
>> >> package authors, we should strive to compute to full precision when
>> >> that's possible without undue complexity or extreme performance
>> >> tradeoffs. The problem is that if packages A, B, and C all decide that 8
>> >> digits is good enough for anyone and so throw away half the precision of
>> >> floating point arguments, and I decide to take the output of package A
>> >> as input to package B, and the output of B as input to C, suddenly I
>> >> don't have 8 digits, I have 2. I'm sure you probably understand this
>> >> point already, but it's something I didn't completely appreciate until
>> >> recently.
>> >>
>> >>
>> >>> When I came here, I thought Julia was meant for "technical computing"
>> >>> and not so much for pure math exercises. In scientific computing,
>> >>> calculating numerical derivatives and gradients is essential for many
>> >>> applications.
>> >>>
>> >>>
>> >> I hope it will be for both. There are a lot of amazing people working on
>> >> it.
>> >>
>> >> [1] First derivative of Lambert W:
>> >> http://www.wolframalpha.com/input/?i=LambertW'[e]
>> >> [2] The release of PowerSeries that makes this work is this PR:
>> >> https://github.com/JuliaLang/METADATA.jl/pull/535
>> >> [3] Second derivative of Lambert W:
>> >> http://www.wolframalpha.com/input/?i=LambertW''[e]
>> >>
>> >>
>> >>> On Tuesday, January 21, 2014 3:28:28 AM UTC+1, Jason Merrill wrote:
>> >>>>
>> >>>> Would you be willing to share a problem that you're interested in that
>> >>>> isn't amenable to symbolic differentiation, maybe on a gist or
>> >>>> something? I've been looking for non-trivial problems to try to apply
>> >>>> PowerSeries to.
>> >>>>
>> >>>> When I said numerical differentiation is "unstable", I meant something
>> >>>> like "hard to do without losing several digits". For example, you point
>> >>>> out that by choosing a different h, it's possible to improve over
>> >>>> Calculus.jl's current behavior by 2-3 decimal digits, from 7e-7 to 2e-9
>> >>>> in absolute error. But the best precision you could hope for in a
>> >>>> floating point answer to this problem is
>> >>>>
>> >>>> julia> eps(sin(1.0))
>> >>>> 1.1102230246251565e-16
>> >>>>
>> >>>> so even with your improvements, you've lost a lot of precision. That
>> >>>> might or might not be a problem depending on what you'd like to do next
>> >>>> with the value you get out.
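>> >>>>
>> >>>> (To put numbers on that: with a plain central second difference, the
>> >>>> truncation error scales like h^2 and the rounding error like eps/h^2,
>> >>>> so even the best choice of h keeps only about half of the 16 digits.
>> >>>> The step sizes below are purely illustrative:
>> >>>>
>> >>>> julia> d2fd(f, x, h) = (f(x + h) - 2*f(x) + f(x - h)) / h^2
>> >>>> julia> [abs(d2fd(sin, 1.0, h) + sin(1.0)) for h in [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]]
>> >>>>
>> >>>> The smallest of these errors is typically a few times 1e-8, nowhere
>> >>>> near eps(sin(1.0)).)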
>> >>>>
>> >>>>
>> >>>> On Monday, January 20, 2014 12:30:38 PM UTC-8, Hans W Borchers wrote:
>> >>>>>
>> >>>>> Numerical differentiation is not nearly as unstable as you seem to
>> >>>>> think. And I have long experience using numerical derivatives for
>> >>>>> optimization problems, where you don't stop to look up symbolic
>> >>>>> derivatives with a CAS.
>> >>>>> The function obviously was only an example.
>> >>>>> For most of the functions I have used, Julia's symbolic capabilities
>> >>>>> will not be sufficient.
>> >>>>>
>> >>>>>
>> >>>>> On Monday, January 20, 2014 9:07:02 PM UTC+1, Jason Merrill wrote:
>> >>>>>>
>> >>>>>> This implementation could certainly use some love, but finite
>> >>>>>> difference differentiation is always unstable, and the situation gets
>> >>>>>> worse as you take higher order derivatives.
>> >>>>>>
>> >>>>>> You might want to consider using the differentiate method to take
>> >>>>>> your derivatives symbolically (if this works for the functions you're
>> >>>>>> using), or have a look at the differentiation example in
>> >>>>>> PowerSeries.jl. To do a symbolic second derivative, you can do, e.g.
>> >>>>>>
>> >>>>>> julia> using Calculus
>> >>>>>> julia> @eval d2(x) = $(differentiate(differentiate(:(sin(x)))))
>> >>>>>> julia> d2(1.0)
>> >>>>>> -0.8414709848078965
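>> >>>>>>
>> >>>>>> For reference, the analytic second derivative of sin is -sin, and the
>> >>>>>> symbolic result matches -sin(1.0) exactly:
>> >>>>>>
>> >>>>>> julia> -sin(1.0)
>> >>>>>> -0.8414709848078965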
>> >>>>>>
>> >>>>>
>>
>> --
>> Mauro A Werder
>> Bristol Glaciology Centre
>> School of Geographical Sciences
>> University of Bristol, UK
>>
>> Telephone: +44 (0) 117 3314154
>> E-Mail: [email protected]
>> http://maurow.bitbucket.org
>>
>>
>