Should have mentioned: to make these examples work, you need to start off 
your session by adding DualNumbers and PowerSeries if you don't have them 
already, and then loading them with "using":

  julia> Pkg.add("DualNumbers")
  julia> Pkg.add("PowerSeries")
  julia> using DualNumbers
  julia> using PowerSeries

On Wednesday, January 22, 2014 12:22:37 AM UTC-8, Jason Merrill wrote:
>
> The examples you mention are definitely interesting ones. Thanks for 
> mentioning them.
>
> On Tuesday, January 21, 2014 4:05:15 AM UTC-8, Hans W Borchers wrote:
>>
>> Think of a special function not (yet) implemented in Julia, like Lambert 
>> W, or Struve, or ... So you define it yourself, e.g. by applying 
>> Newton-Raphson. How would you symbolically differentiate that?
>>
>
> Here's Newton's method for the Lambert W function:
>
>   julia> function lambert_newton(x)
>              # Newton iteration for y*exp(y) = x, i.e. y = W(x)
>              y = zero(x)
>              for i = 1:25
>                  y = y - (y*exp(y) - x)/(exp(y)*(1 + y))
>              end
>              return y
>          end
>
>   julia> lambert_newton(1.0*e)
>   1.0
>
> You can differentiate this function to full precision by passing a Dual as 
> an argument:
>
>   julia> lambert_newton(Dual(e, 1.0))
>   1.0 + 0.18393972058572117du
>
> Compare to the exact value [1]
>   
>   julia> 1/2e
>   0.18393972058572117
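> 
> (In case the mechanism isn't obvious: a Dual carries a value and a 
> derivative, and every arithmetic operation propagates both, so the du part 
> of the result is the exact derivative up to rounding. A minimal sketch, 
> using only the Dual arithmetic already shown above:
> 
>   julia> f(x) = x^2 + 3x
>   julia> f(Dual(2.0, 1.0))   # value is f(2) = 10, du part is f'(2) = 7
> 
> The same thing happens inside every iteration of the Newton loop.)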
>
> You can take a second derivative using (the master branch of [2]) 
> PowerSeries by passing a 2nd order power series as an argument and applying 
> polyder twice:
>
>   julia> polyder(polyder(lambert_newton(series(e, 1.0, 0.0))))
>   -0.05075073121372976
>
> Compare to the exact value [3]
>
>   julia> -3/8e^2
>   -0.050750731213729756
>
> In this case, we're off by one bit.
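> 
> Note that a single evaluation on a 2nd order series gives you the value, 
> the first derivative, and the second derivative all at once; the nested 
> polyder calls are just reading off coefficients. A minimal sketch of that, 
> using only the same functions as above:
> 
>   julia> s = lambert_newton(series(e, 1.0, 0.0))  # one forward pass
>   julia> polyder(s)           # constant term is W'(e)
>   julia> polyder(polyder(s))  # W''(e)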
>
>> Or suppose your application forces you to define a procedure with if-else 
>> constructs and loops; how do you differentiate that? Even an advanced 
>> "automatic differentiation" approach will most often not be capable of 
>> solving this for you.
>>
>> Or imagine having to minimize a function such as
>>
>> function repelling_balls(x::Array{Float64, 1})
>>     if length(x) != 45; error("Arg. 'x' must be a vector of length 45."); end
>>     d = 2.0
>>     for i = 1:14, j = (i+1):15
>>         s = sqrt((x[i]-x[j])^2 + (x[i+15]-x[j+15])^2 + (x[i+30]-x[j+30])^2)
>>         if s < d; d = s; end
>>     end
>>     return -d
>> end
>>
>> that models 15 repelling balls/points in [0, 1]^3.
>>
>
> As for your second example, you can extract the gradient vector element by 
> element by lifting one coordinate at a time to a dual number, i.e. replacing 
> x[i] by Dual(x[i], 1.0) and then examining the du part of the result. You 
> just need to drop the type annotation in your definition.
>
>   julia> x = rand(45) + Dual(0.0, 0.0)  # promote every element to a Dual
>   julia> grad = zeros(Float64, 45)
>   julia> for i = 1:45
>              xp = copy(x)                # copy so perturbations don't accumulate
>              xp[i] += Dual(0.0, 1.0)
>              r = repelling_balls(xp)
>              grad[i] = r.du
>          end
>   julia> grad
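> 
> (If you want to do this more than once, the loop wraps up nicely into a 
> helper. A minimal sketch; the name dual_gradient is mine, and it assumes 
> the same .du field used above:
> 
>   julia> function dual_gradient(f, x)
>              xd = x + Dual(0.0, 0.0)    # promote every element to a Dual
>              g = zeros(Float64, length(x))
>              for i = 1:length(x)
>                  xp = copy(xd)          # copy so perturbations don't accumulate
>                  xp[i] += Dual(0.0, 1.0)
>                  g[i] = f(xp).du
>              end
>              return g
>          end
> 
>   julia> grad = dual_gradient(repelling_balls, rand(45)))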
>
> The components of grad are now probably accurate to within 1 bit. This 
> example works out a little funny because the box size is small compared to 
> the ball size, but you can adjust the parameters to fix it. In principle, 
> you could extract the Hessian by taking directional second derivatives with 
> PowerSeries, but I haven't fully worked that out yet. It would be 
> interesting to know the stiffness about the equilibrium point once you've 
> found it.
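> 
> For what it's worth, here is a rough sketch of the directional second 
> derivative (untested, and it assumes PowerSeries supports the same 
> arithmetic and comparisons the Dual example relied on): lift every 
> coordinate to a 2nd order series along a direction v, so the result is the 
> Taylor expansion of t -> f(x + t*v), and read off the second derivative, 
> which equals v'*H*v for the Hessian H.
> 
>   julia> function directional_second_deriv(f, x, v)
>              xs = [series(x[i], v[i], 0.0) for i = 1:length(x)]
>              return polyder(polyder(f(xs)))   # d^2/dt^2 f(x + t*v) at t = 0
>          end
> 
> Individual Hessian entries would then follow from the polarization 
> identity, H[i,j] = (q(ei + ej) - q(ei) - q(ej))/2 with q(v) = v'*H*v.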
>  
>
>> As "minimax" problem this function is smooth except for a subset of lower 
>> dimension. Some optimization procedures using gradients will be able to 
>> circumvent this subset and find a (local) minimum. How will you 
>> differentiate that (i.e., build a gradient) if not numerically; and if yes, 
>> is it worth the effort?
>>
>
> In both of these examples, you pay nothing in effort or performance 
> compared to finite difference differentiation, and you avoid losing half 
> your precision. I think that's very exciting!
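> 
> If you want to see the precision claim concretely, compare both approaches 
> against the exact value from [1] (the step size h below is just an 
> illustrative choice, not a tuned one):
> 
>   julia> exact = 1/2e
>   julia> ad = lambert_newton(Dual(e, 1.0)).du
>   julia> h = 1e-6
>   julia> fd = (lambert_newton(e + h) - lambert_newton(e - h))/(2h)
>   julia> abs(ad - exact), abs(fd - exact)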
>  
>
>> Consider that an optimization or root-finding procedure is a (hopefully) 
>> converging process that will not return an absolutely precise result. A 
>> function may itself be defined through an optimization or root-finding 
>> task, and its derivative will still have a technical meaning.
>>
>> In real-world applications, and the tasks Matlab is made for, you do not 
>> care about 15 digits; gradients accurate to five, six, or seven digits are 
>> more than enough to solve your problem. Normally, your functions are not 
>> defined to eps() accuracy anyway.
>>
>
> I should not have downplayed the significance or utility of finite 
> difference differentiation. It is obviously tremendously important, and is 
> suitable for many applications. Julia should strive to implement the best 
> versions possible. Currently, different packages are at different points of 
> their development.
>
> That said, it seems like you're trying to have it both ways here, in the 
> sense that you're upset that Calculus throws away 2-3 digits compared to a 
> better finite difference method, but at the same time you're downplaying 
> the significance of recovering the remaining 7 digits to reach full 
> precision.
>
> In a specific application, it's often obvious how many digits are 
> relevant, and that number is definitely often less than 15. But as package 
> authors, we should strive to compute to full precision when that's possible 
> without undue complexity or extreme performance tradeoffs. The problem is 
> that if packages A, B, and C all decide that 8 digits is good enough for 
> anyone and so throw away half the precision of their floating point 
> arguments, and I take the output of package A as input to package B, and 
> the output of B as input to C, suddenly I don't have 8 digits, I have 2: 
> each stage halves the digits that survive, 16 -> 8 -> 4 -> 2. I'm sure you 
> understand this point already, but it's something I didn't completely 
> appreciate until recently.
>  
>
>> When I came here, I thought Julia was meant for "technical computing" and 
>> not so much for pure Math exercises. In scientific computing, calculating 
>> numerical derivatives and gradients is essential for many applications.
>>
>>
> I hope it will be for both. There's a lot of amazing people working on it.
>
> [1] First derivative of Lambert W: 
> http://www.wolframalpha.com/input/?i=LambertW'[e]
> [2] The release of PowerSeries that makes this work is this PR: 
> https://github.com/JuliaLang/METADATA.jl/pull/535
> [3] Second derivative of Lambert W: 
> http://www.wolframalpha.com/input/?i=LambertW''[e]
>  
>
>> On Tuesday, January 21, 2014 3:28:28 AM UTC+1, Jason Merrill wrote:
>>>
>>> Would you be willing to share a problem that you're interested in that 
>>> isn't amenable to symbolic differentiation, maybe on a gist or something? 
>>> I've been looking for non-trivial problems to try to apply PowerSeries to.
>>>
>>> When I said numerical differentiation is "unstable", I meant something 
>>> like "hard to do without losing several digits". For example, you point out 
>>> that by choosing a different h, it's possible to improve over Calculus.jl's 
>>> current behavior by 2-3 decimal digits, from 7e-7 to 2e-9 in absolute 
>>> error. But the best precision you could hope for in a floating point answer 
>>> to this problem is
>>>
>>> julia> eps(sin(1.0))
>>> 1.1102230246251565e-16
>>>
>>> so even with your improvements, you've lost a lot of precision. That 
>>> might or might not be a problem depending on what you'd like to do next 
>>> with the value you get out.
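>>>
>>> If it helps to see that loss concretely, here's a quick illustration (the 
>>> step size is an arbitrary choice, not the optimum):
>>>
>>> julia> h = 1e-4
>>> julia> fd2 = (sin(1.0 + h) - 2sin(1.0) + sin(1.0 - h))/h^2
>>> julia> abs(fd2 - (-sin(1.0)))   # many orders of magnitude above eps(sin(1.0))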
>>>
>>>
>>> On Monday, January 20, 2014 12:30:38 PM UTC-8, Hans W Borchers wrote:
>>>>
>>>> Numerical differentiation is by far not as unstable as you seem to 
>>>> think. And I have long experience using numerical derivatives for 
>>>> optimization problems, where you don't stop to look up symbolic 
>>>> derivatives with a CAS. The function obviously was only an example. 
>>>> For most of the functions I have used, Julia's symbolic capabilities 
>>>> will not be sufficient.
>>>>
>>>>
>>>> On Monday, January 20, 2014 9:07:02 PM UTC+1, Jason Merrill wrote:
>>>>>
>>>>> This implementation could certainly use some love, but finite 
>>>>> difference differentiation is always unstable, and the situation gets 
>>>>> worse 
>>>>> as you take higher order derivatives.
>>>>>
>>>>> You might want to consider using the differentiate method to take your 
>>>>> derivatives symbolically (if this works for the functions you're using), 
>>>>> or 
>>>>> have a look at the differentiation example in PowerSeries.jl. To do a 
>>>>> symbolic second derivative, you can do, e.g.
>>>>>
>>>>> julia> using Calculus
>>>>> julia> @eval d2(x) = $(differentiate(differentiate(:(sin(x)))))
>>>>> julia> d2(1.0)
>>>>> -0.8414709848078965
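>>>>>
>>>>> or, as a sketch of the PowerSeries route (assuming sin is extended to 
>>>>> series the same way exp is), take the second derivative to full 
>>>>> precision with
>>>>>
>>>>> julia> using PowerSeries
>>>>> julia> polyder(polyder(sin(series(1.0, 1.0, 0.0))))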
>>>>>
>>>>
