jiayuasu opened a new pull request, #2673:
URL: https://github.com/apache/sedona/pull/2673

   ## Summary
   
   Fix `toEpsgCode()` to work correctly with PROJJSON input and add 
`toAuthority()` for non-EPSG CRS identification.
   
   When a CRS is constructed from PROJJSON (e.g., from pyproj output), 
`toEpsgCode()` previously failed to identify well-known codes because:
   1. The `id` field's authority/code was not preserved during parsing
   2. Datum names from PROJJSON were not stored or used for identification
   3. Projection method names like `"Transverse Mercator"` were not normalized 
to PROJ short names (`"tmerc"`)
   4. Ellipsoid-only comparison could not distinguish NAD83 (GRS 1980) from WGS 
84
   
   ## Changes
   
   ### New API: `toAuthority()`
   - `Proj.toAuthority()` / `CRSSerializer.toAuthority()` returns 
`{"authority", "code"}` (e.g., `{"EPSG", "4326"}` or `{"IAU", "49900"}`)
   - Similar to pyproj's `CRS.to_authority()` — supports non-EPSG authorities
   - `toEpsgCode()` now delegates to `toAuthority()` and filters for EPSG-only 
results
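
The delegation in the last bullet can be pictured with a small sketch. It assumes the authority pair is a two-element `String[]` and that "no match" is `null`; the class name `EpsgFilterSketch` is hypothetical and this is not the code from the PR.

```java
// Hypothetical sketch: not the actual proj4sedona implementation.
final class EpsgFilterSketch {
    /**
     * Takes an {authority, code} pair (assumed shape: String[2]) and keeps it
     * only when the authority is EPSG, formatted as "EPSG:<code>".
     */
    static String toEpsgCode(String[] authority) {
        if (authority == null || authority.length != 2) {
            return null; // nothing was identified
        }
        if (!"EPSG".equalsIgnoreCase(authority[0])) {
            return null; // e.g. IAU:49900 is a valid authority, but not EPSG
        }
        return "EPSG:" + authority[1];
    }
}
```

With this shape, an `{"IAU", "49900"}` pair remains available to callers of `toAuthority()` while `toEpsgCode()` reports no EPSG code for it.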
   
   ### Three-phase CRS identification (mirrors pyproj)
   1. **srsCode** — if the PROJJSON `id` field was parsed, return it directly
   2. **Datum name** — look up well-known datum names (restricted to geographic 
CRS)
   3. **Parameter matching** — compare projection parameters against known EPSG 
definitions
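
The three phases above can be sketched as a simple fall-through. Everything here is illustrative: `ParsedCrs`, `identify`, and the two map entries are assumptions made for the example, and the parameter-matching phase is passed in as a function rather than implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the lookup order; names do not come from the PR.
final class ThreePhaseSketch {
    /** Minimal stand-in for a parsed CRS (not the real Proj class). */
    static final class ParsedCrs {
        final String srsCode;     // e.g. "EPSG:4326", or null if PROJJSON had no id
        final String datumName;   // datum name from PROJJSON, or null
        final boolean geographic; // datum-name lookup is restricted to geographic CRS

        ParsedCrs(String srsCode, String datumName, boolean geographic) {
            this.srsCode = srsCode;
            this.datumName = datumName;
            this.geographic = geographic;
        }
    }

    // Illustrative entries only; the real table is described as having ~27 datums.
    static final Map<String, String> DATUM_NAME_TO_EPSG = new HashMap<>();
    static {
        DATUM_NAME_TO_EPSG.put("World Geodetic System 1984", "4326");
        DATUM_NAME_TO_EPSG.put("NAD83 (National Spatial Reference System 2011)", "6318");
    }

    /** Returns {authority, code}, or null when no phase identifies the CRS. */
    static String[] identify(ParsedCrs crs, Function<ParsedCrs, String[]> parameterMatcher) {
        // Phase 1: an explicit id wins, whatever the authority is.
        if (crs.srsCode != null) {
            String[] parts = crs.srsCode.split(":", 2);
            if (parts.length == 2) {
                return parts;
            }
        }
        // Phase 2: well-known datum name, geographic CRS only.
        if (crs.geographic && crs.datumName != null) {
            String code = DATUM_NAME_TO_EPSG.get(crs.datumName);
            if (code != null) {
                return new String[] {"EPSG", code};
            }
        }
        // Phase 3: fall back to comparing projection parameters.
        return parameterMatcher.apply(crs);
    }
}
```

Ordering the phases this way means a CRS that carries its own `id` never depends on the heuristic phases.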
   
   ### Parser fixes (ProjJsonTransformer)
   - Store `datum.name` / `datum_ensemble.name` in `datumCode` field
   - Store `id.authority:id.code` in `srsCode` field (with Gson Double→int 
handling)
   - Preserve `srsCode` on `Proj` construction from `"EPSG:4326"` strings
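
As background for the Double→int note: when Gson deserializes JSON into untyped values, a numeric `id.code` such as `4326` arrives as the `Double` `4326.0`. A minimal sketch of one way to normalize it follows; the helper name `formatSrsCode` is hypothetical, and the PR's actual handling may differ.

```java
// Hypothetical helper; illustrates the Double -> int normalization only.
final class SrsCodeSketch {
    /** Builds "AUTHORITY:CODE", rendering whole-number codes without ".0". */
    static String formatSrsCode(Object authority, Object code) {
        if (authority == null || code == null) {
            return null; // no usable id in the PROJJSON
        }
        String codeText;
        if (code instanceof Number) {
            double d = ((Number) code).doubleValue();
            boolean whole = !Double.isInfinite(d) && d == Math.rint(d);
            // 4326.0 -> "4326"; non-integral values are kept as-is.
            codeText = whole ? Long.toString((long) d) : code.toString();
        } else {
            codeText = code.toString(); // codes may also be JSON strings
        }
        return authority + ":" + codeText;
    }
}
```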
   
   ### Robust matching (CRSSerializer)
   - `WKT_TO_PROJ_METHOD` — reverse map with all 43 EPSG canonical PROJJSON 
method names
   - `DATUM_NAME_TO_EPSG` — ~27 common datums sourced from the PROJ database 
(EPSG registry)
   - `datumCodesMatch()` — normalizes both sides to EPSG codes for comparison; 
rejects unknown datums (matching pyproj behavior)
   - `matchesDefinition()` — now checks inverse flattening (`rf`) to 
distinguish GRS 1980 from WGS 84
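
To see why the `rf` check matters: GRS 1980 and WGS 84 share the same semi-major axis (6378137 m) and differ only slightly in inverse flattening (298.257222101 vs 298.257223563). The sketch below illustrates that comparison together with the "normalize, then reject unknowns" idea behind `datumCodesMatch()`. The tolerances, alias table, and method bodies are assumptions for illustration, not the PR's code.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; tolerances and alias entries are illustrative.
final class MatchingSketch {
    static final double GRS80_RF = 298.257222101;
    static final double WGS84_RF = 298.257223563;

    /** Ellipsoids match only if both the semi-major axis and rf agree. */
    static boolean sameEllipsoid(double a1, double rf1, double a2, double rf2) {
        // Assumed tolerances: rf must be tighter than the ~1.5e-6 gap
        // between GRS 1980 and WGS 84, or the two become indistinguishable.
        return Math.abs(a1 - a2) < 1e-3 && Math.abs(rf1 - rf2) < 1e-7;
    }

    // Illustrative alias table keyed by lower-cased datum name.
    static final Map<String, String> DATUM_TO_EPSG = new HashMap<>();
    static {
        DATUM_TO_EPSG.put("wgs84", "4326");
        DATUM_TO_EPSG.put("world geodetic system 1984", "4326");
        DATUM_TO_EPSG.put("north american datum 1983", "4269");
    }

    /** Normalizes both names to codes; unknown datums never match. */
    static boolean datumCodesMatch(String left, String right) {
        String l = left == null ? null : DATUM_TO_EPSG.get(left.toLowerCase(Locale.ROOT));
        String r = right == null ? null : DATUM_TO_EPSG.get(right.toLowerCase(Locale.ROOT));
        return l != null && l.equals(r);
    }
}
```

Rejecting unknown datums, rather than falling back to string equality, is what removes the false positives listed for the "unknown datum" rows in the comparison table.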
   
   ## pyproj behavior comparison
   
   The following table shows behavior alignment with pyproj 3.7.2 (PROJ 9.6.0):
   
   | Input | pyproj `to_epsg()` | proj4sedona `toEpsgCode()` (before) | proj4sedona `toEpsgCode()` (after) |
   |---|---|---|---|
   | PROJJSON WGS 84 with `id.code=4326` | 4326 | null | EPSG:4326 |
   | PROJJSON NAD83(2011) with `id.code=6318` | 6318 | null | EPSG:6318 |
   | PROJJSON UTM 32N with `id.code=32632` | 32632 | null | EPSG:32632 |
   | PROJJSON IAU:49900 | None (not EPSG) | null | null |
   | PROJJSON NAD83(2011) **no id**, datum name present | 6318 | null | EPSG:6318 |
   | PROJJSON unknown datum + GRS 1980, **no id** | None (confidence 60%) | EPSG:4269 (false positive) | null |
   | PROJJSON unknown datum + WGS 84, **no id** | None (confidence 60%) | EPSG:4326 (false positive) | null |
   | `"EPSG:4326"` string | 4326 | EPSG:4326 | EPSG:4326 |
   | `"+proj=longlat +datum=WGS84"` | 4326 | EPSG:4326 | EPSG:4326 |
   | `"EPSG:3857"` string | 3857 | EPSG:3857 | EPSG:3857 |
   
   ## Performance
   
   Microbenchmark comparing old vs new `matchesDefinition()` (10,000 
iterations, Apple M1 Max):
   
   | Scenario | Old (equalsIgnoreCase) | New (datumCodesMatch) | Delta |
   |---|---|---|---|
   | WGS 84 match | 0.4 µs | 0.4 µs | noise |
   | NAD83 match | 0.4 µs | 0.4 µs | noise |
   | Unknown datum reject | 0.3 µs | 0.3 µs | noise |
   
   The datum normalization lookup adds negligible overhead because the cost of 
`matchesDefinition()` is dominated by the `new Proj(code)` call inside it. Full 
test suite time is unchanged (~10s).
   
   ## Tests
   
   - 13 new tests covering PROJJSON identification, `toAuthority()`, datum name 
lookup, unknown datum rejection, IAU authority, and round-trips
   - All 661 tests passing
   

