I wonder what the assumptions on a type that is <: Real should be. If it
should correspond to a subset of the real line, then I guess that dual
numbers are out.
But maybe the contract should be something looser, such as satisfying
real(x) === x? Or maybe there should be a common supertype of Real and
Dual, whatever it might be called and stand for, that could be used
instead of Real in many places?
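
Until such a supertype exists, generic code can of course stay agnostic by
dispatching on Number (as Mauro suggests below) or by spelling out a union
explicitly. A minimal sketch, with Dual taken from DualNumbers.jl and the
function names made up:

julia> using DualNumbers
julia> sq(x::Number) = x * x               # accept anything numeric
julia> sq2(x::Union(Real, Dual)) = x * x   # or name the admissible types explicitly
julia> sq(Dual(2.0, 1.0))
4.0 + 4.0du
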
On Wednesday, 22 January 2014 13:25:38 UTC+1, Stefan Karpinski wrote:
>
> I wonder if Dual shouldn't be a subtype of Real. Of course, you can
> probably have Dual complex numbers as well, but maybe those can be
> Complex{Dual} in that case.
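>
> (One concrete reason the subtyping matters: Complex constrains its type
> parameter to Real; in Base it is declared roughly as
>
>     immutable Complex{T<:Real} <: Number
>         re::T
>         im::T
>     end
>
> so Complex{Dual{Float64}} only becomes possible if Dual <: Real.)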
>
>
> On Wed, Jan 22, 2014 at 7:07 AM, Mauro <[email protected]> wrote:
>
>> Dual is not a subtype of Real so there is no method when calling with
>> Dual. You need to define the function like so:
>>
>> function lambert_W(x::Number)
>> ...
>> end
>>
>> then:
>>
>> julia> lambert_W(Dual(e, 1.0))
>> 1.0 + 0.18393972058572117du
>>
>> Mauro
>>
>> On Wed, 2014-01-22 at 11:30, Hans W Borchers wrote:
>> > Jason, very nice, and I will certainly look at DualNumbers.jl. But I have
>> > doubts about your approach to the Lambert W function.
>> >
>> > See, lambert_newton(-1/e) returns -0.9999999434202022 and should return -1.
>> > Below that, the real branch of Lambert W is not defined, but
>> > lambert_newton(-0.5) returns 0.9706698296525587 even though the value
>> > there is complex.
>> >
>> > This is not so problematic, as you could return NaN for those values. What
>> > about arguments x > 20, for example
>> >
>> > julia> lambert_newton(25)
>> > 3.1737928264162476
>> >
>> > while the true value is 2.36015... And lambert_newton(1000) returns NaN,
>> > true value 5.249603... That is why CASs such as Maple define the Lambert W
>> > function through a Halley-type iteration such as
>> >
>> > function lambert_W(x::Real)
>> >     if x < -1/exp(1); return NaN; end
>> >     w0 = 1.0
>> >     w1 = w0 - (w0 * exp(w0) - x)/((w0 + 1) * exp(w0) -
>> >              (w0 + 2) * (w0 * exp(w0) - x)/(2 * w0 + 2))
>> >     n = 1
>> >     while abs(w1 - w0) > 1e-15 && n <= 20
>> >         w0 = w1
>> >         w1 = w0 - (w0 * exp(w0) - x)/((w0 + 1) * exp(w0) -
>> >                  (w0 + 2) * (w0 * exp(w0) - x)/(2 * w0 + 2))
>> >         n += 1
>> >     end
>> >     return w1
>> > end
>> >
>> > Now the value of lambert_W(-1/e) = -0.9999999911436428 is slightly better
>> > than before, and for x >= 20 and much larger arguments we get correct
>> > results.
>> >
>> > Assuming lambert_W is defined through this iterative procedure, will
>> > DualNumbers.jl work for it? I did not succeed:
>> >
>> > julia> lambert_W(Dual(e, 1.0))
>> > ERROR: no method lambert_W(Dual{Float64},)
>> >
>> > By the way, Lambert's W function at 1.0 is Gauss' omega constant: both
>> > lambert_W and lambert_newton return 0.5671432904097838, which is correct
>> > to the last digit.
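>> >
>> > One can check this directly from the defining relation w*exp(w) = x:
>> >
>> > julia> w = 0.5671432904097838
>> > julia> w * exp(w)    # gives 1.0 up to rounding, since W(1)*exp(W(1)) = 1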
>> >
>> > Hans Werner
>> >
>> >
>> > On Wednesday, January 22, 2014 9:22:37 AM UTC+1, Jason Merrill wrote:
>> >>
>> >> The examples you mention are definitely interesting ones. Thanks for
>> >> mentioning them.
>> >>
>> >> On Tuesday, January 21, 2014 4:05:15 AM UTC-8, Hans W Borchers wrote:
>> >>>
>> >>> Think of a special function not (yet) implemented in Julia, like Lambert
>> >>> W, or Struve, or ... So you define it yourself, e.g. by applying
>> >>> Newton-Raphson. How would you symbolically differentiate that?
>> >>>
>> >>
>> >> Here's Newton's method for the Lambert W function:
>> >>
>> >> julia> function lambert_newton(x)
>> >>            y = zero(x)
>> >>            for i = 1:25
>> >>                y = y - (y*exp(y)-x)/(exp(y)*(1+y))
>> >>            end
>> >>            return y
>> >>        end
>> >>
>> >> julia> lambert_newton(1.0*e)
>> >> 1.0
>> >>
>> >> You can differentiate this function to full precision by passing a Dual
>> >> as an argument:
>> >>
>> >> julia> lambert_newton(Dual(e, 1.0))
>> >> 1.0 + 0.18393972058572117du
>> >>
>> >> Compare to the exact value [1]
>> >>
>> >> julia> 1/2e
>> >> 0.18393972058572117
>> >>
>> >> You can take a second derivative using (the master branch of [2])
>> >> PowerSeries by passing a 2nd order power series as an argument, and
>> >> using polyder twice:
>> >>
>> >> julia> polyder(polyder(lambert_newton(series(e, 1.0, 0.0))))
>> >> -0.05075073121372976
>> >>
>> >> Compare to the exact value [3]
>> >>
>> >> julia> -3/8e^2
>> >> -0.050750731213729756
>> >>
>> >> In this case, we're off by one bit.
>> >>
>> >>> Or your application forces you to define a procedure with if-else
>> >>> constructs and loops; how do you differentiate that? Even an advanced
>> >>> "automatic differentiation" approach will most often not be capable of
>> >>> solving this for you.
>> >>>
>> >>> Or imagine having to minimize a function such as
>> >>>
>> >>> function repelling_balls(x::Array{Float64, 1})
>> >>>     if length(x) != 45; error("Arg. 'x' must be a vector of length 45."); end
>> >>>     d = 2.0
>> >>>     for i = 1:14, j = (i+1):15
>> >>>         s = sqrt((x[i]-x[j])^2 + (x[i+15]-x[j+15])^2 + (x[i+30]-x[j+30])^2)
>> >>>         if s < d; d = s; end
>> >>>     end
>> >>>     return -d
>> >>> end
>> >>>
>> >>> that models 15 repelling balls/points in [0, 1]^3.
>> >>>
>> >>
>> >> As for your second example, you can extract the gradient vector element
>> >> by element by lifting one coordinate at a time to a dual number, i.e.
>> >> replacing x[i] by Dual(x[i], 1.0), and then examining the du part of the
>> >> result. You just need to drop the type annotation in your definition.
>> >>
>> >> julia> x = rand(45) + Dual(0.0, 0.0)
>> >> julia> grad = zeros(Float64, 45)
>> >> julia> for i = 1:45
>> >>            xp = copy(x)    # copy, so each perturbation starts from the unperturbed x
>> >>            xp[i] += Dual(0.0, 1.0)
>> >>            r = repelling_balls(xp)
>> >>            grad[i] = r.du
>> >>        end
>> >> julia> grad
>> >>
>> >> The components of grad are now probably accurate to within 1 bit. This
>> >> example works out a little funny because the box size is small compared
>> >> to the ball size, but you can adjust the parameters to fix it. In
>> >> principle, you could extract the hessian by taking directional second
>> >> derivatives with PowerSeries, but I haven't fully worked that out yet.
>> >> That could be useful for finding the stiffness about the equilibrium
>> >> point once you've found it.
>> >>
>> >>
>> >>> As "minimax" problem this function is smooth except for a subset of
>> lower
>> >>> dimension. Some optimization procedures using gradients will be able
>> to
>> >>> circumvent this subset and find a (local) minimum. How will you
>> >>> differentiate that (i.e., build a gradient) if not numerically; and
>> if yes,
>> >>> is it worth the effort?
>> >>>
>> >>
>> >> In both of these examples, you pay nothing in effort or performance
>> >> compared to finite difference differentiation, and you avoid losing half
>> >> your precision. I think that's very exciting!
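>> >>
>> >> To make that concrete for the Lambert W example above: a central
>> >> difference (the step size h below is just an illustrative choice) gets
>> >> roughly nine or ten digits right, while the dual number reproduces 1/2e
>> >> to the last bit:
>> >>
>> >> julia> h = 1e-6
>> >> julia> (lambert_newton(e + h) - lambert_newton(e - h)) / (2h)  # ~9-10 digits of 1/2e
>> >> julia> lambert_newton(Dual(e, 1.0)).du
>> >> 0.18393972058572117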
>> >>
>> >>
>> >>> Consider that an optimization or root finding procedure is a (hopefully)
>> >>> converging process that will not return an absolutely precise result. A
>> >>> function itself may be defined through an optimization or root finding
>> >>> task, and its derivative will then have a technical meaning.
>> >>>
>> >>> In real-world applications, and the tasks Matlab is made for, you do not
>> >>> care about 15 digits; gradients accurate to five, six, or seven digits
>> >>> are more than enough to solve your problem. Normally, your functions are
>> >>> not defined to eps() accuracy anyway.
>> >>>
>> >>
>> >> I should not have downplayed the significance or utility of finite
>> >> difference differentiation. It is obviously tremendously important, and
>> >> is suitable for many applications. Julia should strive to implement the
>> >> best versions possible. Currently, different packages are at different
>> >> points of their development.
>> >>
>> >> That said, it seems like you're trying to have it both ways here, in the
>> >> sense that you're upset that Calculus throws away 2-3 digits compared to
>> >> a better finite difference method, but at the same time, you're
>> >> downplaying the significance of the other 7 digits needed to reach full
>> >> precision.
>> >>
>> >> In a specific application, it's often obvious how many digits are
>> >> relevant, and that number is definitely often less than 15. But as
>> >> package authors, we should strive to compute to full precision when
>> >> that's possible without undue complexity or extreme performance
>> >> tradeoffs. The problem is that if packages A, B, and C all decide that 8
>> >> digits is good enough for anyone and so throw away half the precision of
>> >> floating point arguments, and I decide to take the output of package A
>> >> as input to package B, and the output of B as input to C, suddenly I
>> >> don't have 8 digits, I have 2. I'm sure you probably understand this
>> >> point already, but it's something I didn't completely appreciate until
>> >> recently.
>> >>
>> >>
>> >>> When I came here, I thought Julia was meant for "technical computing"
>> >>> and not so much for pure math exercises. In scientific computing,
>> >>> calculating numerical derivatives and gradients is essential for many
>> >>> applications.
>> >>>
>> >>>
>> >> I hope it will be for both. There are a lot of amazing people working on
>> >> it.
>> >>
>> >> [1] First derivative of Lambert W:
>> >> http://www.wolframalpha.com/input/?i=LambertW'[e]
>> >> [2] The release of PowerSeries that makes this work is this PR:
>> >> https://github.com/JuliaLang/METADATA.jl/pull/535
>> >> [3] Second derivative of Lambert W:
>> >> http://www.wolframalpha.com/input/?i=LambertW''[e]
>> >>
>> >>
>> >>> On Tuesday, January 21, 2014 3:28:28 AM UTC+1, Jason Merrill wrote:
>> >>>>
>> >>>> Would you be willing to share a problem that you're interested in that
>> >>>> isn't amenable to symbolic differentiation, maybe on a gist or
>> >>>> something? I've been looking for non-trivial problems to try to apply
>> >>>> PowerSeries to.
>> >>>>
>> >>>> When I said numerical differentiation is "unstable", I meant something
>> >>>> like "hard to do without losing several digits". For example, you point
>> >>>> out that by choosing a different h, it's possible to improve over
>> >>>> Calculus.jl's current behavior by 2-3 decimal digits, from 7e-7 to 2e-9
>> >>>> in absolute error. But the best precision you could hope for in a
>> >>>> floating point answer to this problem is
>> >>>>
>> >>>> julia> eps(sin(1.0))
>> >>>> 1.1102230246251565e-16
>> >>>>
>> >>>> so even with your improvements, you've lost a lot of precision. That
>> >>>> might or might not be a problem depending on what you'd like to do next
>> >>>> with the value you get out.
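>> >>>>
>> >>>> (To put numbers on that: with a plain central second difference, the
>> >>>> truncation error scales like h^2 and the rounding error like eps/h^2,
>> >>>> so even the best choice of h keeps only about half of the 16 digits.
>> >>>> The step sizes below are purely illustrative:
>> >>>>
>> >>>> julia> d2fd(f, x, h) = (f(x + h) - 2*f(x) + f(x - h)) / h^2
>> >>>> julia> [abs(d2fd(sin, 1.0, h) + sin(1.0)) for h in [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]]
>> >>>>
>> >>>> The smallest of these errors is typically a few times 1e-8, nowhere
>> >>>> near eps(sin(1.0)).)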
>> >>>>
>> >>>>
>> >>>> On Monday, January 20, 2014 12:30:38 PM UTC-8, Hans W Borchers wrote:
>> >>>>>
>> >>>>> Numerical differentiation is not nearly as unstable as you seem to
>> >>>>> think. And I have long experience using numerical derivatives for
>> >>>>> optimization problems, where you don't stop to look up symbolic
>> >>>>> derivatives with a CAS.
>> >>>>> The function obviously was only an example.
>> >>>>> For most of the functions I have used, Julia's symbolic capabilities
>> >>>>> will not be sufficient.
>> >>>>>
>> >>>>>
>> >>>>> On Monday, January 20, 2014 9:07:02 PM UTC+1, Jason Merrill wrote:
>> >>>>>>
>> >>>>>> This implementation could certainly use some love, but finite
>> >>>>>> difference differentiation is always unstable, and the situation gets
>> >>>>>> worse as you take higher order derivatives.
>> >>>>>>
>> >>>>>> You might want to consider using the differentiate method to take
>> >>>>>> your derivatives symbolically (if this works for the functions you're
>> >>>>>> using), or have a look at the differentiation example in
>> >>>>>> PowerSeries.jl. To do a symbolic second derivative, you can do, e.g.
>> >>>>>>
>> >>>>>> julia> using Calculus
>> >>>>>> julia> @eval d2(x) = $(differentiate(differentiate(:(sin(x)))))
>> >>>>>> julia> d2(1.0)
>> >>>>>> -0.8414709848078965
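>> >>>>>>
>> >>>>>> For reference, the analytic second derivative of sin is -sin, and the
>> >>>>>> symbolic result matches -sin(1.0) exactly:
>> >>>>>>
>> >>>>>> julia> -sin(1.0)
>> >>>>>> -0.8414709848078965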
>> >>>>>>
>> >>>>>
>>
>> --
>> Mauro A Werder
>> Bristol Glaciology Centre
>> School of Geographical Sciences
>> University of Bristol, UK
>>
>> Telephone: +44 (0) 117 3314154
>> E-Mail: [email protected]
>> http://maurow.bitbucket.org
>>
>>
>