kou commented on code in PR #47183:
URL: https://github.com/apache/arrow/pull/47183#discussion_r2239583273


##########
python/pyarrow/tests/parquet/test_parquet_file.py:
##########
@@ -348,6 +348,7 @@ def test_read_statistics():
     statistics = pq.ParquetFile(buf).read().columns[0].chunks[0].statistics
     assert statistics.null_count == 1
     assert statistics.distinct_count is None
+    assert statistics.is_distinct_count_exact is False

Review Comment:
   Good point.
   
   It's difficult for now. We don't provide an API for setting a custom 
`ArrayStatistics.distinct_count`, so `ArrayStatistics.distinct_count` can only 
be built from the statistics in an Apache Parquet file. But the current Apache 
Parquet writer doesn't support `distinct_count` yet.
   
   So we don't have `is_distinct_count_exact` == `None`/`True` code paths to 
test for now.
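
   As a rough illustration of which paths are missing, here is a minimal sketch 
that uses a stand-in object instead of real statistics. `StubStatistics` and 
`classify` are hypothetical names made up for this example and are not part of 
pyarrow. Only the first case (`distinct_count is None` with 
`is_distinct_count_exact is False`) matches what the test above can observe 
today; the other two are assumptions about what a test could look like once 
`distinct_count` can be populated.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubStatistics:
    # Hypothetical stand-in, not a pyarrow class. It only mirrors the two
    # attributes discussed in this thread.
    distinct_count: Optional[int]
    is_distinct_count_exact: Optional[bool]


def classify(stats):
    """Name the code path a statistics object would exercise."""
    if stats.distinct_count is None:
        return "missing"
    if stats.is_distinct_count_exact:
        return "exact"
    return "approximate"


# The only path reachable today, matching the assertions in the diff above.
current = StubStatistics(distinct_count=None, is_distinct_count_exact=False)

# Assumed future paths: these need a source of distinct_count first.
exact = StubStatistics(distinct_count=3, is_distinct_count_exact=True)
approximate = StubStatistics(distinct_count=3, is_distinct_count_exact=False)

print(classify(current), classify(exact), classify(approximate))
```

   With real data, only the first object can be produced right now, which is why 
the test asserts just that case.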



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
