Re: [PR] [GH-2004] Geopandas.GeoSeries: Implement Test Framework [sedona]

via GitHub Wed, 25 Jun 2025 10:17:47 -0700


zhangfengcdt commented on PR #2005:
URL: https://github.com/apache/sedona/pull/2005#issuecomment-3005566360


   > > Looks great! I think for the first step, we could target covering 
whatever these scala tests cover in exact results.
   > 
   > That made perfect sense originally, but now I see why it wasn't done in 
Scala. These test files we're using (`self.mixedWktGeometryInputLocation`) have 
100 entries. Hard coding 100 polygons for a single test (e.g for ST_buffer) 
would make reading and navigating the test file a real pain.
   > 
   > Another thought. Originally, I had a test checking if the sedona sql 
function's result matched our new sgpd result. However, we agreed to remove it 
because it seemed trivial that of course the results would match. Following 
that same logic, if we can assume the new sedona geopandas results match the 
sedona sql results, then why do we need to replicate the Scala tests again if 
we already know sedona sql passes it? Or maybe we should just not make that 
assumption (in that case maybe it is worth bringing back that old sgpd vs 
sedona test). Personally, I'm fine with making the assumption. If anything, 
maybe we should hard-code results for the `test_match_geopandas_series.py` 
tests since those results are smaller. WDYT @zhangfengcdt?
   
   I think the reason why we still need the hard-coded tests are that they are 
not simply replicating the scala expression tests, they should extend the scala 
cases. Basically, the goals are:
   - Verify the SQL created to convert geopandas -> sedona queries are expected;
   - Verify manually the results for matching or differences (reason) between 
geopandas origin and geopandas on Sedona.
   
   This means we will very possibly cover more cases than the scala expression 
tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] [GH-2004] Geopandas.GeoSeries: Implement Test Framework [sedona]

Reply via email to