alamb opened a new issue, #158:
URL: https://github.com/apache/parquet-site/issues/158

   This ticket tries to capture the discussion with @steveloughran, 
@csringhofer, myself and others on 
https://github.com/apache/parquet-site/pull/156#pullrequestreview-3772364529
   
   > It's been pointed out to me that the coverage matrix doesn't cover 
statistics/geometry bounding, without which predicate pushdown doesn't work: 
every rowgroup with the column needs scanning.
   
   The core point as I understand it is that there are several features that 
must be implemented in software libraries to realize the full benefits of the 
new Geometry and Geography types in Parquet. Specifically mentioned were
   
   - Logical type annotation (to know what columns hold Geometry and Geography 
types) <-- this is what the page currently reflects
   - Statistics implementation (e.g. the bounding boxes, and potentially 
different algorithms to compute them)
   - Query engine implementation (e.g. using the bounding box statistics to 
prune Parquet files at query time)
   
   There are probably more.
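
   To make the statistics and query engine items above concrete, here is a 
minimal Python sketch of the idea, not the API of any Parquet library. The 
names `BBox`, `bbox_of` and `row_groups_to_scan` are hypothetical, and the 
sketch assumes planar x/y coordinates only (it ignores cases such as boxes 
that cross the antimeridian). A row group without a bounding box cannot be 
skipped, which is the "every rowgroup needs scanning" situation quoted above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class BBox:
    """Hypothetical 2D bounding box, standing in for per-row-group statistics."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def intersects(self, other: "BBox") -> bool:
        # Two axis-aligned boxes overlap only if they overlap on both axes.
        return (self.xmin <= other.xmax and other.xmin <= self.xmax
                and self.ymin <= other.ymax and other.ymin <= self.ymax)


def bbox_of(points: Iterable[Tuple[float, float]]) -> BBox:
    """Writer side: compute the bounding box of the points in one row group."""
    xs, ys = zip(*points)
    return BBox(min(xs), min(ys), max(xs), max(ys))


def row_groups_to_scan(stats: List[Optional[BBox]], query: BBox) -> List[int]:
    """Reader side: keep row groups whose box overlaps the query window.

    A row group with no statistics (None) must always be scanned.
    """
    return [i for i, box in enumerate(stats)
            if box is None or box.intersects(query)]


# Three row groups with statistics, plus one written without them.
row_group_points = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(10.0, 10.0), (12.0, 11.0)],
    [(0.5, 5.0), (2.0, 6.0)],
]
stats: List[Optional[BBox]] = [bbox_of(p) for p in row_group_points]
stats.append(None)

query = BBox(0.0, 0.0, 2.0, 2.0)
selected = row_groups_to_scan(stats, query)
print(selected)
```

   With these inputs only the first row group overlaps the query window, and 
the last one is kept because it has no statistics to prune with.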
   
   ## Suggestions
   One idea is to add more specific detail to 
https://parquet.apache.org/docs/file-format/implementationstatus/ .
   
   <img width="938" height="81" alt="Image" 
src="https://github.com/user-attachments/assets/947066eb-ed56-4e89-8a81-e30e24989d32"
 />
   
   Perhaps it would be appropriate to add a specific line for the 
geography/geometry statistics, for example
   
   In addition to making the current implementation status clearer, red X's 
on the page seem to have the effect of putting pressure on the ecosystem to 
adopt the feature.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

