I wonder if Dual shouldn't be a subtype of Real. Of course, you can
probably have Dual complex numbers as well, but maybe those can be
Complex{Dual} in that case.
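
Roughly what I have in mind, as a sketch only (the field names follow the .du
accessor used later in this thread; DualNumbers.jl does not currently declare
anything like this):

    immutable Dual{T<:Real} <: Real
        re::T   # value part
        du::T   # derivative (epsilon) part
    end

With a declaration like that, a method written for x::Real, such as the
lambert_W below, would accept Dual arguments without loosening the signature
to Number.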


On Wed, Jan 22, 2014 at 7:07 AM, Mauro <[email protected]> wrote:

> Dual is not a subtype of Real so there is no method when calling with
> Dual.  You need to define the function like so:
>
> function lambert_W(x::Number)
> ...
> end
>
> then:
>
> julia> lambert_W(Dual(e, 1.0))
> 1.0 + 0.18393972058572117du
>
> Mauro
>
> On Wed, 2014-01-22 at 11:30, Hans W Borchers wrote:
> > Jason, very nice, and I will certainly look at DualNumbers.jl. But I have
> > doubts about your approach to the Lambert W function.
> >
> > See, lambert_newton(-1/e) returns -0.9999999434202022 and should return -1.
> > Below that, the real branch of Lambert W is not defined, but
> > lambert_newton(-0.5) returns 0.9706698296525587 though we are in the
> > imaginary branch.
> >
> > This is not so problematic as you could return NaN for those values. What
> > about parameters x > 20, for example
> >
> >     julia> lambert_newton(25)
> >     3.1737928264162476
> >
> > while the true value is 2.36015... And lambert_newton(1000) returns NaN,
> > true value 5.249603... That is why CASs such as Maple implement the Lambert W
> > function through a Halley-type iteration such as
> >
> >     function lambert_W(x::Real)
> >         if x < -1/exp(1); return NaN; end
> >         w0 = 1.0
> >         w1 = w0 - (w0 * exp(w0) - x)/((w0 + 1) * exp(w0) -
> >             (w0 + 2) * (w0 * exp(w0) - x)/(2 * w0 + 2))
> >         n = 1
> >         while abs(w1 - w0) > 1e-15 && n <= 20
> >             w0 = w1
> >             w1 = w0 - (w0 * exp(w0) - x)/((w0 + 1) * exp(w0) -
> >                 (w0 + 2) * (w0 * exp(w0) - x)/(2 * w0 + 2))
> >             n += 1
> >         end
> >         return w1
> >     end
> >
> > Now the value of lambert_W(-1/e) = -0.9999999911436428 is slightly better
> > than before, and for x >= 20 and much larger arguments we now get correct results.
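> >
> > For instance, re-checking the two arguments from above (expected values as
> > quoted there):
> >
> >     julia> lambert_W(25)      # ≈ 2.36015..., where lambert_newton gave 3.1737...
> >     julia> lambert_W(1000)    # ≈ 5.249603..., where lambert_newton returned NaN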
> >
> > Assume lambert_W is defined through this iterative procedure; will
> > DualNumbers.jl work for this function? I did not succeed:
> >
> >     julia> lambert_W(Dual(e, 1.0))
> >     ERROR: no method lambert_W(Dual{Float64},)
> >
> > By the way, Lambert's W function at 1.0 is Gauss' Omega constant: both
> > lambert_W and lambert_newton return 0.5671432904097838, which is correct to
> > the last digit.
> >
> > Hans Werner
> >
> >
> > On Wednesday, January 22, 2014 9:22:37 AM UTC+1, Jason Merrill wrote:
> >>
> >> The examples you mention are definitely interesting ones. Thanks for
> >> bringing them up.
> >>
> >> On Tuesday, January 21, 2014 4:05:15 AM UTC-8, Hans W Borchers wrote:
> >>>
> >>> Think of a special function not (yet) implemented in Julia, like Lambert
> >>> W, or Struve, or ... So you define it yourself, e.g. by applying
> >>> Newton-Raphson. How would you symbolically differentiate that?
> >>>
> >>
> >> Here's Newton's method for the Lambert W function:
> >>
> >>   julia> function lambert_newton(x)
> >>         y = zero(x)
> >>         for i = 1:25
> >>           y = y - (y*exp(y)-x)/(exp(y)*(1+y))
> >>         end
> >>         return y
> >>        end
> >>
> >>   julia> lambert_newton(1.0*e)
> >>   1.0
> >>
> >> You can differentiate this function to full precision by passing a Dual as
> >> an argument:
> >>
> >>   julia> lambert_newton(Dual(e, 1.0))
> >>   1.0 + 0.18393972058572117du
> >>
> >> Compare to the exact value [1]
> >>
> >>   julia> 1/2e
> >>   0.18393972058572117
> >>
> >> You can take a second derivative using (the master branch of [2])
> >> PowerSeries by passing a 2nd order power series as an argument, and using
> >> polyder twice:
> >>
> >>   julia> polyder(polyder(lambert_newton(series(e, 1.0, 0.0))))
> >>   -0.05075073121372976
> >>
> >> Compare to the exact value [3]
> >>
> >>   julia> -3/8e^2
> >>   -0.050750731213729756
> >>
> >> In this case, we're off by one bit.
> >>
> >>> Or your application forces you to define a procedure with if-else
> >>> constructs and loops; how to differentiate? Even an advanced "automated
> >>> differentiation" approach will most often not be capable of solving this
> >>> for you.
> >>>
> >>> Or imagine having to minimize a function such as
> >>>
> >>> function repelling_balls(x::Array{Float64, 1})
> >>>   if length(x) != 45; error("Arg. 'x' must be a vector of length 45."); end
> >>>   d = 2.0
> >>>   for i = 1:14, j = (i+1):15
> >>>       s = sqrt((x[i]-x[j])^2 + (x[i+15]-x[j+15])^2 + (x[i+30]-x[j+30])^2)
> >>>       if s < d; d = s; end
> >>>   end
> >>>   return -d
> >>> end
> >>>
> >>> that models 15 repelling balls/points in [0, 1]^3.
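> >>>
> >>> A usage sketch (not part of the original definition; the coordinates are
> >>> stored as x, y, z in three blocks of 15):
> >>>
> >>>     x0 = rand(45)            # 15 random points in [0, 1]^3
> >>>     repelling_balls(x0)      # minus the smallest pairwise distance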
> >>>
> >>
> >> As for your second example, you can extract the gradient vector element by
> >> element by lifting one coordinate at a time to a dual number, i.e. replace
> >> x[i] by Dual(x[i], 1.0), and then examine the du part of the result. You
> >> just need to drop the type annotation in your definition.
> >>
> >>   julia> x = rand(45) + Dual(0.0, 0.0)
> >>   julia> grad = zeros(Float64, 45)
> >>   julia> for i = 1:45
> >>              xp = copy(x)  # copy so the dual perturbation does not accumulate in x
> >>              xp[i] += Dual(0.0, 1.0)
> >>              r = repelling_balls(xp)
> >>              grad[i] = r.du
> >>            end
> >>   julia> grad
> >>
> >> The components of grad are now probably accurate to within 1 bit. This
> >> example works out a little bit funny because the box size is small compared
> >> to the ball size, but you can adjust the parameters to fix it. In
> >> principle, you could extract the Hessian by taking directional second
> >> derivatives with PowerSeries, but I haven't fully worked that out yet. That
> >> would be useful for knowing the stiffness about the equilibrium point once
> >> you've found it.
> >>
> >>
> >>> As "minimax" problem this function is smooth except for a subset of
> lower
> >>> dimension. Some optimization procedures using gradients will be able to
> >>> circumvent this subset and find a (local) minimum. How will you
> >>> differentiate that (i.e., build a gradient) if not numerically; and if
> yes,
> >>> is it worth the effort?
> >>>
> >>
> >> In both of these examples, you pay nothing in effort or performance
> >> compared to finite difference differentiation, and you avoid losing half
> >> your precision. I think that's very exciting!
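> >>
> >> As a rough comparison (a sketch reusing lambert_newton from above; the step
> >> size 1e-8 is just an illustrative choice):
> >>
> >>   julia> lambert_newton(Dual(e, 1.0)).du - 1/2e
> >>   julia> (lambert_newton(e + 1e-8) - lambert_newton(e))/1e-8 - 1/2e
> >>
> >> The dual-number result agrees with 1/2e to full precision, as shown above,
> >> while the forward difference is typically off by something like 1e-8, i.e.
> >> roughly half the digits are gone.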
> >>
> >>
> >>> Consider that an optimization or root finding procedure is a (hopefully)
> >>> converging process that will not return an absolutely precise result. A
> >>> function may itself be defined through an optimization or root finding
> >>> task, and its derivative will have a technical meaning.
> >>>
> >>> In real-world applications, and the tasks Matlab is made for, you do not
> >>> care about 15 digits, and gradients accurate to five, six, or seven digits
> >>> are more than enough to solve your problem. Normally, your functions are
> >>> not defined to eps() accuracy anyway.
> >>>
> >>
> >> I should not have downplayed the significance or utility of finite
> >> difference differentiation. It is obviously tremendously important, and is
> >> suitable for many applications. Julia should strive to implement the best
> >> versions possible. Currently, different packages are at different points of
> >> their development.
> >>
> >> That said, it seems like you're trying to have it both ways here, in the
> >> sense that you're upset that Calculus throws away 2-3 digits compared to a
> >> better finite difference method, but at the same time, you're downplaying
> >> the significance of recovering the other 7 digits to reach full precision.
> >>
> >> In a specific application, it's often obvious how many digits are
> >> relevant, and that number is definitely often less than 15. But as package
> >> authors, we should strive to compute to full precision when that's possible
> >> without undue complexity or extreme performance tradeoffs. The problem is
> >> that if packages A, B, and C all decide that 8 digits is good enough for
> >> anyone and so throw away half the precision of floating point arguments,
> >> and I decide to take the output of package A as input to package B, and the
> >> output of B as input to C, suddenly I don't have 8 digits, I have 2. I'm
> >> sure you probably understand this point already, but it's something I
> >> didn't completely appreciate until recently.
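> >>
> >> A rough illustration of that compounding (a sketch, with a hypothetical
> >> helper fd; each finite-difference stage keeps only part of the digits it
> >> is given):
> >>
> >>   julia> fd(f, x, h) = (f(x + h) - f(x - h))/(2h)
> >>   julia> fd(sin, 1.0, 1e-6) - cos(1.0)                     # good to roughly 10 digits
> >>   julia> fd(x -> fd(sin, x, 1e-6), 1.0, 1e-6) + sin(1.0)   # typically only 4-5 digits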
> >>
> >>
> >>> When I came here, I thought Julia was meant for "technical computing" and
> >>> not so much for pure math exercises. In scientific computing, calculating
> >>> numerical derivatives and gradients is essential for many applications.
> >>>
> >>>
> >> I hope it will be for both. There are a lot of amazing people working on it.
> >>
> >> [1] First derivative of Lambert W:
> >> http://www.wolframalpha.com/input/?i=LambertW'[e]
> >> [2] The release of PowerSeries that makes this work is this PR:
> >> https://github.com/JuliaLang/METADATA.jl/pull/535
> >> [3] Second derivative of Lambert W:
> >> http://www.wolframalpha.com/input/?i=LambertW''[e]
> >>
> >>
> >>> On Tuesday, January 21, 2014 3:28:28 AM UTC+1, Jason Merrill wrote:
> >>>>
> >>>> Would you be willing to share a problem that you're interested in that
> >>>> isn't amenable to symbolic differentiation, maybe on a gist or something?
> >>>> I've been looking for non-trivial problems to try to apply PowerSeries to.
> >>>>
> >>>> When I said numerical differentiation is "unstable", I meant something
> >>>> like "hard to do without losing several digits". For example, you point out
> >>>> that by choosing a different h, it's possible to improve over Calculus.jl's
> >>>> current behavior by 2-3 decimal digits, from 7e-7 to 2e-9 in absolute
> >>>> error. But the best precision you could hope for in a floating point answer
> >>>> to this problem is
> >>>>
> >>>> julia> eps(sin(1.0))
> >>>> 1.1102230246251565e-16
> >>>>
> >>>> so even with your improvements, you've lost a lot of precision. That
> >>>> might or might not be a problem depending on what you'd like to do next
> >>>> with the value you get out.
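> >>>>
> >>>> For a concrete sense of the gap, here's a first-derivative sketch (the
> >>>> exact error depends on the choice of h):
> >>>>
> >>>> julia> h = 1e-5
> >>>> julia> abs((sin(1.0 + h) - sin(1.0 - h))/(2h) - cos(1.0))
> >>>>
> >>>> which comes out on the order of 1e-11, roughly five orders of magnitude
> >>>> above eps(sin(1.0)).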
> >>>>
> >>>>
> >>>> On Monday, January 20, 2014 12:30:38 PM UTC-8, Hans W Borchers wrote:
> >>>>>
> >>>>> Numerical differentiation is by far not as unstable as you seem to
> >>>>> think.
> >>>>> And I have long experience using numerical derivatives for optimization
> >>>>> problems where you don't stop to look up symbolic derivatives with a CAS.
> >>>>> The function obviously was only an example.
> >>>>> For most of the functions I have used, Julia's symbolic capabilities
> >>>>> will not be sufficient.
> >>>>>
> >>>>>
> >>>>> On Monday, January 20, 2014 9:07:02 PM UTC+1, Jason Merrill wrote:
> >>>>>>
> >>>>>> This implementation could certainly use some love, but finite
> >>>>>> difference differentiation is always unstable, and the situation gets
> >>>>>> worse as you take higher order derivatives.
> >>>>>>
> >>>>>> You might want to consider using the differentiate method to take your
> >>>>>> derivatives symbolically (if this works for the functions you're using),
> >>>>>> or have a look at the differentiation example in PowerSeries.jl. To do a
> >>>>>> symbolic second derivative, you can do, e.g.
> >>>>>>
> >>>>>> julia> using Calculus
> >>>>>> julia> @eval d2(x) = $(differentiate(differentiate(:(sin(x)))))
> >>>>>> julia> d2(1.0)
> >>>>>> -0.8414709848078965
> >>>>>>
> >>>>>
>
> --
> --
> Mauro A Werder
> Bristol Glaciology Centre
> School of Geographical Sciences
> University of Bristol, UK
>
> Telephone: +44 (0) 117 3314154
> E-Mail:  [email protected]
> http://maurow.bitbucket.org
>
>
