Re: [PROJ] Geodetic to Authalic latitude conversions

Even Rouault via PROJ Thu, 12 Sep 2024 06:18:13 -0700

Jérôme,

feel free to make the code refactorings that make sense. All thisgeodetic <--> authalic conversion is PROJ internals.


Regarding pj_auth2geodlat() signature, I would suggest that it is:

double pj_auth2geodlat(const double * cp, double sinphi, double cosphi)

since looking at the code, at least in cea, eqearth and laea, weactually compute phi by doing a "asin(something)" before


So instead of doing:

phi = asin(something);
phi = pj_auth2geodlat(cp, phi);

we could just do:

phi = pj_auth2geodlat(cp, something, cos(asin(something)))

And that would save us one sin() call

And potentially cos(asin(something)) = sqrt(1 - something^2)

I've tested and that later formula seems to be equivalent in term ofprecision to using cos(asin()), and sometimes marginally better:


For example for values near 0:

>>> '%.17g' % math.cos(math.asin(1e-7))
'0.999999999999995'
>>> '%.17g' % math.sqrt(1 - 1e-7 ** 2)
'0.999999999999995'

At 0.5:

>>> '%.17g' % math.cos(math.asin(.5))
'0.8660254037844386'
>>> '%.17g' % math.sqrt(1 - .5 ** 2)
'0.8660254037844386'

Near 1:

>>> '%.17g' % math.sqrt(1 - (1 - 1e-7) ** 2)
'0.00044721358421085739'
>>> '%.17g' % math.cos(math.asin(1 - 1e-7))
'0.00044721358420187217'

Using https://www.mathsisfun.com/calculator-precision.html, the fullprecision value is:


0.000447213584319617912028634...

so the sqrt() based formula is slightly closer, but clearly not at themaximum precision of a double.

If we wanted to be fully precised, for values close to 1, we couldactually evaluate sqrt(delta * (2 - delta)) with delta = 1 - something


then

>>> '%.17g' % math.sqrt(1e-7 * (2 - 1e-7))
'0.00044721358431961788'

which matches quite perfectly the full precision result.

By experimenting a bit it seems that the appropriate value to use thesqrt(delta * (2 - delta)) formula would be for "something" to be in [1 -0.1, 1]


So I'd suggest a

double pj_cos_of_asin(x)
{
   constexpr double EPSILON_FOR_SQRT = 0.1;
   constexpr double EPSILON_FOR_SLIGHTLY_OUT_OF_DOMAIN = 1e-8;
   if( x >= 1 - EPSILON_FOR_SQRT)
   {
       if( x <= 1 )
       {
           const double delta = 1 - x;
           return sqrt(delta * (2 - delta));
       }
       else if( x <= 1 + EPSILON_FOR_SLIGHTLY_OUT_OF_DOMAIN)
       {
           return 0;
       }
       else
       {
           return std::numeric_limits<double>::quiet_NaN();
       }
    }
    else if( x <= -1 + EPSILON_FOR_SQRT) {
       if( x >= -1 )
       {
           const double delta = x + 1;
           return sqrt(delta * (2 - delta));
       }
       else if( x >= -1 - EPSILON_FOR_SLIGHTLY_OUT_OF_DOMAIN)
       {
           return 0;
       }
       else
       {
           return std::numeric_limits<double>::quiet_NaN();
       }
    }
    else
    {
        return sqrt(1 - x * x);
    }

}

Even

Le 12/09/2024 à 14:09, Charles Karney a écrit :

I recommend against unrolling the loops.  This makes the code longer and
harder to read.  You also lose the flexibility of adjusting the number
of terms in the expansion at runtime.

I'm not an expert of the minutiae of code optimization and my preference
would normally be to keep the code simple and understandable.  But
remember that compilers can do the loop unrolling for you.  Also,
doesn't the smaller code size with the loops result in fewer cache
misses?

Maybe polyval can go into a header file to get inlined.  But remember
that this is needed relatively rarely when setting up the coefficients
for a particular ellipsoid.  So it's probably now contributing much to
the overall CPU cost.

It is possible to do 2x unroll of the Clenshaw loop to avoid the
shuffling of variables (t = xx(u0, u1), u1 = u0, u0 = t).  See the
function SinCosSeries in geodesic.c where this is done.

Undoubtedly, we could do a better job centralizing some of these core
capabilities, Clenshaw (and its complex counterpart) + general auxiliary
latitude conversions, so that we don't have essentially duplicate code
scattered all over the place.

On Wed, Sep 11, 2024 at 10:44 PM Jérôme St-Louis<jer...@ecere.com>  wrote:

Dear Even, Charles, Thomas, All,

Please find below a couple revised implementations of the authalic ==> geodetic 
conversion using Horner's method and Clenshaw summation algorithm, both sharing 
this table of coefficients from A20:

#define AUTH_ORDER 6

static const double Cphimu[21] = // Cφξ (A20) - coefficients to convert 
authalic latitude to geodetic latitude
{
    4 / 3.0,  4 / 45.0,   -16/35.0,  -2582 /14175.0,  60136 /467775.0,    
28112932/ 212837625.0,
             46 / 45.0,  152/945.0, -11966 /14175.0, -21016 / 51975.0,   
251310128/ 638512875.0,
                       3044/2835.0,   3802 /14175.0, -94388 / 66825.0,    
-8797648/  10945935.0,
                                      6059 / 4725.0,  41072 / 93555.0, 
-1472637812/ 638512875.0,
                                                     768272 /467775.0,  
-455935736/ 638512875.0,
                                                                        
4210684958/1915538625.0
};

This first one is using the existing functions from mlfn.cpp (untouched other 
than possibly different formatting here):

// Evaluate sum(p[i] * x^i, i, 0, N) via Horner's method (p is of length N+1)
static inline double polyval(double x, const double p[], int N)
{
    double y = N < 0 ? 0 : p[N];
    while(N > 0)
       y = y * x + p[--N];
    return y;
}

// Evaluate y = sum(c[k] * sin((2*k+2) * zeta), k, 0, K-1)
static inline double clenshaw(double szeta, double czeta, const double c[], int 
K)
{
    // Approx operation count = (K + 5) mult and (2 * K + 2) add
    double u0 = 0, u1 = 0; // accumulators for sum
    double X = 2 * (czeta - szeta) * (czeta + szeta); // 2 * cos(2*zeta)
    while(K > 0)
    {
       double t = X * u0 - u1 + c[--K];
       u1 = u0;
       u0 = t;
    }
    return 2 * szeta * czeta * u0; // sin(2*zeta) * u0
}

//https://arxiv.org/pdf/2212.05818
// ∆η(ζ) = S^(L)(ζ) · Cηζ · P^(M) (n) + O(n^L+1)    -- (20)
void pj_authset(double a, double b, double cp[AUTH_ORDER])
{
    double n = (a - b) / (a + b);  // Third flattening
    double d = n;
    int l, o;

    for(l = 0, o = 0; l < AUTH_ORDER; l++)
    {
       int m = AUTH_ORDER - l - 1;

       cp[l] = d * polyval(n, Cphimu + o, m);
       d *= n;
       o += m + 1;
    }
}

double pj_auth2geodlat(const double * cp, double phi)
{
    return phi + clenshaw(sin(phi), cos(phi), cp, AUTH_ORDER);
}

For this second implementation, I unrolled the loops to get rid of the 
iterations (and associated counter incrementations) and conditionals, which if 
the compiler is not optimizing out, could potentially introduce some branching 
costs.
This unrolled version remains quite compact (at least in this particular 
formatting which the pre-commit hook will certainly massacre). The sequence of 
operations is exactly the same, and I've tested that the two are equivalent, 
and also equivalent with the two earlier implementations that I shared which 
were not using Horner and Clenshaw, and also equivalent to 8 decimals to the 
existing pj_authlat() function in PROJ.

//https://arxiv.org/pdf/2212.05818
// ∆η(ζ) = S^(L)(ζ) · Cηζ · P^(M) (n) + O(n^L+1)    -- (20)
void pj_authset(double a, double b, double cp[AUTH_ORDER])
{
    // Precomputing coefficient based on Horner's method
    double n = (a - b) / (a + b);  // Third flattening
    const double * C = Cphimu;
    double d = n;

    cp[0] = (((((C[ 5] * n + C[ 4]) * n + C[ 3]) * n + C[ 2]) * n + C[ 1]) * n 
+ C[ 0]) * d, d *= n;
    cp[1] = ((((             C[10]  * n + C[ 9]) * n + C[ 8]) * n + C[ 7]) * n 
+ C[ 6]) * d, d *= n;
    cp[2] = (((                           C[14]  * n + C[13]) * n + C[12]) * n 
+ C[11]) * d, d *= n;
    cp[3] = ((                                         C[17]  * n + C[16]) * n 
+ C[15]) * d, d *= n;
    cp[4] = (                                                       C[19]  * n 
+ C[18]) * d, d *= n;
    cp[5] =                                                                     
 C[20]  * d;
}

double pj_auth2geodlat(const double * cp, double phi)
{
    // Using Clenshaw summation algorithm (order 6)
    double szeta = sin(phi), czeta = cos(phi);
    // Approx operation count = (K + 5) mult and (2 * K + 2) add
    double X = 2 * (czeta - szeta) * (czeta + szeta); // 2 * cos(2*zeta)
    double u0 = 0, u1 = 0; // accumulators for sum
    double t;
    t = X * u0 - u1 + cp[5], u1 = u0, u0 = t;
    t = X * u0 - u1 + cp[4], u1 = u0, u0 = t;
    t = X * u0 - u1 + cp[3], u1 = u0, u0 = t;
    t = X * u0 - u1 + cp[2], u1 = u0, u0 = t;
    t = X * u0 - u1 + cp[1], u1 = u0, u0 = t;
    t = X * u0 - u1 + cp[0];
    return phi + /* sin(2*zeta) * u0 */ 2 * szeta * czeta * t;
}

Note that the output of these two versions of pj_authset() (the 6 constants 
precomputed from the authalic ==> geodetic A20 conversion matrix and the 
ellipsoid's third flattening) is exactly the same as the previous version not 
using Horner's method, and I believe also the same as the current output of 
pj_autset() except that it currently uses only 3 constants for order 3 rather than 
6 for order 6.

With both of these versions, we're down to only one sin() and one cos() call, 
as per Even's suggestion, so I imagine that the Clenshaw algorithm does take 
advantage of that trigonometric identity trick.

If we go with the separate polyval() and clenshaw() functions, then I suggest 
we move these functions to a header file so that we can share them between 
mlfn.cpp and auth.cpp while allowing the compiler to hopefully efficiently 
inline them, and also hopefully optimize the code close to or equivalent to the 
unrolled version (we could always compare the disassembly to verify whether 
this is the case or not, but I would leave that to others).

My own preference would be for the unrolled version.

We could also make C / Cphimu a parameter to pj_authset() (which could be named 
something else), since this could be used for other conversions between 
auxiliary latitudes.
Similarly, pj_auth2geodlat() could actually be used for different conversions 
if passing it pre-computed coefficients for other conversions, so perhaps it 
could have a more generic names.
The rectifying latitude for pj_enfn() is a bit special because it uses n^2 
rather than n, which tripped me up for a little while.

Thoughts / suggestions on how to move forward with this?

As a next step I would prepare a Pull Request based on your feedback, if you 
have a preference for the shared functions or the unrolled loops approach.

Thank you very much for your help and guidance!

Kind regards,

-Jerome

On 9/11/24 4:21 PM, Jérôme St-Louis wrote:

So it seems like we already have an implementation of Horner and Clenshaw in:

     https://github.com/OSGeo/PROJ/blob/master/src/mlfn.cpp

called polyval() and clenshaw() just like in GeographicLib ( polyval() , 
Clenshaw()).

It seems like Charles wrote or at least updated that :)

That is using the Cµφ (C[mu phi]) (A5) and Cφµ (C[phi mu]) (A6) from page  12 of the 
paper, where µ is called the "rectifying latitude".
I imagine that this is directly related to the "meridional distance" ?

Perhaps we could re-organize this a bit to share this polyval() and clenshaw() 
(they are currently static functions local to this mlfn.cpp) for use in 
auth.cpp ?

Thanks!

Kind regards,

-Jerome

On 9/11/24 3:33 PM, Jérôme St-Louis wrote:

Thanks a lot for the input Charles and Thomas,

I am not familiar with either Horner or Clenshaw, but I do see the mentions now 
on Section 6 - Evaluating the series pages 6 and 7 of the papers.
I implemented the simpler basic approach from section 3 / page 3, which also 
happened to more easily correspond to the existing PROJ implementation.

I can definitely try to understand all this, with the help of this Rust Geodesy 
code and the GeographicLib code, and have a go at updating my proposed 
implementation for improved accuracy and performance.

Kind regards,

-Jerome

On 9/11/24 12:18 PM, Thomas Knudsen wrote:

I totally agree with Charles regarding using Horner for polynomial
evaluation and Clenshaw for the trig series - for accuracy and speed.

I implemented all the material from Charles' preprint
https://arxiv.org/pdf/2212.05818  for Rust Geodesy, when the preprint
appeared about 1½ years ago.

And although (being an experiment) my handling of the raw coefficients
is rather clumsy, at least it gave me a reason to revise my PROJ horner
and clenshaw implementations (which in turn were based on material from
Poder & Engsager: "Some Conformal Mappings...").

So Jérôme, perhaps take a look at the functions "taylor" and "fourier"
over athttps://github.com/busstoptaktik/geodesy/blob/main/src/math/series.rs

While written in Rust, translating to C++ should be rather trivial,
and they may be easier to follow than my decade-old versions already
in the PROJ code base.

--
http://www.spatialys.com
My software is free, but my time generally not.

_______________________________________________
PROJ mailing list
PROJ@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/proj

Re: [PROJ] Geodetic to Authalic latitude conversions

Reply via email to