I should have mentioned: to make these examples work, you need to start off
your session by adding DualNumbers and PowerSeries if you don't have them
already, and then loading them with "using":
julia> Pkg.add("DualNumbers")
julia> Pkg.add("PowerSeries")
julia> using DualNumbers
julia> using PowerSeries
On Wednesday, January 22, 2014 12:22:37 AM UTC-8, Jason Merrill wrote:
>
> The examples you mention are definitely interesting ones. Thanks for
> mentioning them.
>
> On Tuesday, January 21, 2014 4:05:15 AM UTC-8, Hans W Borchers wrote:
>>
>> Think of a special function not (yet) implemented in Julia, like Lambert
>> W, or Struve, or ... So you define it yourself, e.g. by applying
>> Newton-Raphson. How would you symbolically differentiate that?
>>
>
> Here's Newton's method for the Lambert W function:
>
> julia> function lambert_newton(x)
>            # Newton iteration for y*exp(y) = x, i.e. y = W(x)
>            y = zero(x)
>            for i = 1:25
>                y = y - (y*exp(y) - x)/(exp(y)*(1 + y))
>            end
>            return y
>        end
>
> julia> lambert_newton(1.0*e)
> 1.0
>
> You can differentiate this function to full precision by passing a Dual as
> an argument:
>
> julia> lambert_newton(Dual(e, 1.0))
> 1.0 + 0.18393972058572117du
>
> Compare to the exact value [1]
>
> julia> 1/2e
> 0.18393972058572117
>
> You can take a second derivative using PowerSeries (the master branch [2])
> by passing a 2nd-order power series as an argument and applying polyder
> twice:
>
> julia> polyder(polyder(lambert_newton(series(e, 1.0, 0.0))))
> -0.05075073121372976
>
> Compare to the exact value [3]
>
> julia> -3/8e^2
> -0.050750731213729756
>
> In this case, we're off by one bit.
>
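> If you want this repeatedly, the polyder pattern wraps up into a one-liner.
> This is just a sketch; second_derivative is a name I'm making up for
> illustration, not part of the PowerSeries API:
>
> julia> second_derivative(f, x) = polyder(polyder(f(series(x, one(x), zero(x)))))
> julia> second_derivative(lambert_newton, e)
> -0.05075073121372976
>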
>> Or your application forces you to define a procedure with if-else
>> constructs and loops; how do you differentiate that? Even an advanced
>> "automatic differentiation" approach will most often not be capable of
>> solving this for you.
>>
>> Or imagine having to minimize a function such as
>>
>> function repelling_balls(x::Array{Float64, 1})
>>     if length(x) != 45; error("Arg. 'x' must be a vector of length 45."); end
>>     d = 2.0
>>     for i = 1:14, j = (i+1):15
>>         s = sqrt((x[i]-x[j])^2 + (x[i+15]-x[j+15])^2 + (x[i+30]-x[j+30])^2)
>>         if s < d; d = s; end
>>     end
>>     return -d
>> end
>>
>> that models 15 repelling balls/points in [0, 1]^3.
>>
>
> As for your second example, you can extract the gradient vector element by
> element: lift one coordinate at a time to a dual number, i.e. replace x[i]
> by Dual(x[i], 1.0), and then examine the du part of the result. You just
> need to drop the Array{Float64, 1} annotation from your definition.
>
> julia> x = rand(45) + Dual(0.0, 0.0)  # lift to an array of duals
> julia> grad = zeros(Float64, 45)
> julia> for i = 1:45
>            xp = copy(x)  # copy, so the seed doesn't accumulate in x
>            xp[i] += Dual(0.0, 1.0)
>            r = repelling_balls(xp)
>            grad[i] = r.du
>        end
> julia> grad
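>
> If you need gradients like this in more than one place, the loop wraps
> naturally into a helper. This is only a sketch, and dual_gradient is a name
> I'm inventing, not part of DualNumbers:
>
> julia> function dual_gradient(f, x)
>            xd = x + Dual(0.0, 0.0)  # lift to an array of duals
>            g = zeros(Float64, length(x))
>            for i = 1:length(x)
>                xp = copy(xd)
>                xp[i] += Dual(0.0, 1.0)  # seed the i-th coordinate
>                g[i] = f(xp).du
>            end
>            return g
>        end
> julia> dual_gradient(repelling_balls, rand(45))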
>
> The components of grad are now probably accurate to within 1 bit. This
> example works out a little funny because the box size is small compared to
> the ball size, but you can adjust the parameters to fix that. In principle,
> you could extract the Hessian by taking directional second derivatives with
> PowerSeries, but I haven't fully worked that out yet. It could be
> interesting for finding the stiffness about the equilibrium point once
> you've found it.
>
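> Roughly, the idea is the sketch below: lift each coordinate to a power
> series whose first-order part is a direction vector v; the result's second
> derivative is the second derivative of f along v, i.e. v'*H*v, and
> evaluating it for enough directions v would recover the full Hessian.
> directional_d2 is a made-up name, and this assumes f only uses operations
> PowerSeries supports:
>
> julia> function directional_d2(f, x, v)
>            xs = [series(x[i], v[i], 0.0) for i = 1:length(x)]
>            polyder(polyder(f(xs)))
>        end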
>
>> As "minimax" problem this function is smooth except for a subset of lower
>> dimension. Some optimization procedures using gradients will be able to
>> circumvent this subset and find a (local) minimum. How will you
>> differentiate that (i.e., build a gradient) if not numerically; and if yes,
>> is it worth the effort?
>>
>
> In both of these examples, you pay nothing in effort or performance
> compared to finite difference differentiation, and you avoid losing half
> your precision. I think that's very exciting!
>
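> To make that concrete, compare a central difference on lambert_newton with
> the Dual result from above (the step size 1e-6 is just a typical
> uncalibrated choice):
>
> julia> h = 1e-6
> julia> (lambert_newton(e + h) - lambert_newton(e - h))/(2h)
>
> The usual error analysis says to expect roughly 9 or 10 correct digits from
> the central difference here, while Dual(e, 1.0) reproduced every digit of
> 1/2e.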
>
>> Consider that an optimization or root finding procedure is a (hopefully)
>> converging process that will not return an absolutely precise result. A
>> function may itself be defined through an optimization or root finding
>> task, and its derivative will still have a technical meaning.
>>
>> In real-world applications, and tasks Matlab is made for, you do not care
>> for 15 digits, and gradients five, six, seven digits accurate are more than
>> enough to solve your problem. Normally, your functions are not defined for
>> eps() accuracy anyway.
>>
>
> I should not have downplayed the significance or utility of finite
> difference differentiation. It is obviously tremendously important, and is
> suitable for many applications. Julia should strive to implement the best
> versions possible. Currently, different packages are at different points of
> their development.
>
> That said, it seems like you're trying to have it both ways here: you're
> upset that Calculus throws away 2-3 digits compared to a better finite
> difference method, but at the same time you're downplaying the significance
> of recovering the remaining 7 digits needed for full precision.
>
> In a specific application, it's often obvious how many digits are
> relevant, and that number is definitely often less than 15. But as package
> authors, we should strive to compute to full precision when that's possible
> without undue complexity or extreme performance tradeoffs. The problem is
> that if packages A, B, and C all decide that 8 digits is good enough for
> anyone, and so throw away half the precision of floating point arguments,
> then when I take the output of package A as input to package B, and the
> output of B as input to C, suddenly I don't have 8 digits, I have 2. I'm
> sure you understand this point already, but it's something I didn't
> completely appreciate until recently.
>
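> Here's a toy version of that compounding: a second derivative computed as a
> finite difference of a finite difference (the function and step sizes are
> just ones I'm picking for illustration):
>
> julia> fd(f, x, h) = (f(x + h) - f(x - h))/(2h)
> julia> d2_fd(x) = fd(y -> fd(sin, y, 1e-6), x, 1e-6)
> julia> d2_fd(1.0) + sin(1.0)  # error of the nested estimate
>
> Each stage alone is good to 9-10 digits, but the composition is only good
> to about 4. The PowerSeries second derivative above, by contrast, was off
> by one bit.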
>
>> When I came here, I thought Julia was meant for "technical computing" and
>> not so much for pure Math exercises. In scientific computing, calculating
>> numerical derivatives and gradients is essential for many applications.
>>
>>
> I hope it will be for both. There are a lot of amazing people working on it.
>
> [1] First derivative of Lambert W:
> http://www.wolframalpha.com/input/?i=LambertW'[e]
> [2] The release of PowerSeries that makes this work is this PR:
> https://github.com/JuliaLang/METADATA.jl/pull/535
> [3] Second derivative of Lambert W:
> http://www.wolframalpha.com/input/?i=LambertW''[e]
>
>
>> On Tuesday, January 21, 2014 3:28:28 AM UTC+1, Jason Merrill wrote:
>>>
>>> Would you be willing to share a problem that you're interested in that
>>> isn't amenable to symbolic differentiation, maybe on a gist or something?
>>> I've been looking for non-trivial problems to try to apply PowerSeries to.
>>>
>>> When I said numerical differentiation is "unstable", I meant something
>>> like "hard to do without losing several digits". For example, you point out
>>> that by choosing a different h, it's possible to improve over Calculus.jl's
>>> current behavior by 2-3 decimal digits, from 7e-7 to 2e-9 in absolute
>>> error. But the best precision you could hope for in a floating point answer
>>> to this problem is
>>>
>>> julia> eps(sin(1.0))
>>> 1.1102230246251565e-16
>>>
>>> so even with your improvements, you've lost a lot of precision. That
>>> might or might not be a problem depending on what you'd like to do next
>>> with the value you get out.
>>>
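>>> One way to put a number on the loss (this just compares the achieved
>>> error to the best representable one, nothing specific to Calculus.jl):
>>>
>>> julia> log10(2e-9 / eps(sin(1.0)))  # decimal digits lost vs. the best case
>>>
>>> which comes out to a bit over 7, so roughly 7 of the ~16 available digits
>>> are gone even after tuning h.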
>>>
>>> On Monday, January 20, 2014 12:30:38 PM UTC-8, Hans W Borchers wrote:
>>>>
>>>> Numerical differentiation is not nearly as unstable as you seem to
>>>> think. And I have long experience using numerical derivatives for
>>>> optimization problems, where you don't stop to look up symbolic
>>>> derivatives with a CAS.
>>>> The function obviously was only an example.
>>>> For most of the functions I have used, Julia's symbolic capabilities
>>>> will not be sufficient.
>>>>
>>>>
>>>> On Monday, January 20, 2014 9:07:02 PM UTC+1, Jason Merrill wrote:
>>>>>
>>>>> This implementation could certainly use some love, but finite
>>>>> difference differentiation is always unstable, and the situation gets
>>>>> worse as you take higher-order derivatives.
>>>>>
>>>>> You might want to consider using the differentiate method to take your
>>>>> derivatives symbolically (if this works for the functions you're
>>>>> using), or have a look at the differentiation example in PowerSeries.jl.
>>>>> To do a symbolic second derivative, you can do, e.g.
>>>>>
>>>>> julia> using Calculus
>>>>> julia> @eval d2(x) = $(differentiate(differentiate(:(sin(x)))))
>>>>> julia> d2(1.0)
>>>>> -0.8414709848078965
>>>>>
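>>>>> For comparison, the PowerSeries route to the same second derivative
>>>>> would look something like this (assuming sin is defined for series the
>>>>> way exp is):
>>>>>
>>>>> julia> using PowerSeries
>>>>> julia> polyder(polyder(sin(series(1.0, 1.0, 0.0))))
>>>>>
>>>>> which should agree with the symbolic result to within a bit or so.
>>>>>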
>>>>