paleolimbot commented on PR #2971:
URL: https://github.com/apache/parquet-java/pull/2971#issuecomment-2778945139

   >  This is why I proposed to use NaNs for the case of entire column of empty 
geometries.
   
   This works for me, although I'd like a +1 from @wgtmac before changing the 
C++ implementation + test files! This is also consistent with what R and numpy 
will give you if you try to take the `max()` of an empty range.
   
   > In this case any NaN coordinates in the input will surface as NaNs in the 
resulting box. 
   
   This is only true for JTS (GEOS just ignores NaNs when computing a min/max 
for a dimension, lwgeom/PostGIS restarts interval computation after it sees a 
NaN). I think your strategy is a good one for JTS, but I also think it's OK to 
do anything that won't result in accidentally excluding the entire row group 
(i.e., a writer MAY choose to either include or exclude finite coordinates from 
geometries that contain NaN values when writing statistics, or non-points that 
contain NaN values have undefined behaviour but shouldn't affect valid 
geometries in the same row group).
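
   For illustration only, here is a minimal Python sketch of the "ignore NaNs" 
strategy described above, combined with the NaN-bounds convention proposed for 
a column with no usable coordinates. The function name `axis_bounds` is 
hypothetical and this is not code from GEOS, JTS, or parquet-java; it just 
shows one behaviour that would not accidentally exclude a row group.

```python
import math


def axis_bounds(values):
    """Return (min, max) for one dimension, skipping NaN coordinates.

    Assumption for this sketch: if there is no non-NaN value at all
    (empty input or all NaN), the bounds are reported as (nan, nan).
    """
    lo, hi = math.inf, -math.inf
    seen = False
    for v in values:
        if math.isnan(v):
            # Skip NaN so it cannot poison the bounds of valid geometries.
            continue
        seen = True
        lo = min(lo, v)
        hi = max(hi, v)
    return (lo, hi) if seen else (math.nan, math.nan)
```

   With this behaviour, finite coordinates still produce a usable interval 
even when some inputs are NaN, and only an entirely empty (or all-NaN) 
dimension yields NaN bounds.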
   
   > So for now I would assume that we do not want to fail in such cases, but 
rather compute the statistics in a safe manner.
   
   Yes, I think this is best for geometries with NaNs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


