Re: [PR] Added test for count() method and documentation for count() [iceberg-python]

via GitHub Wed, 03 Sep 2025 11:10:21 -0700


gabeiglio commented on code in PR #2423:
URL: https://github.com/apache/iceberg-python/pull/2423#discussion_r2319791508



##########
tests/table/test_count.py:
##########
@@ -0,0 +1,129 @@
+"""
+Unit tests for the DataScan.count() method in PyIceberg.
+
+The count() method is essential for determining the number of rows in an Iceberg table
+without having to load the actual data. It works by examining file metadata and task
+plans to efficiently calculate row counts across distributed data files.
+
+These tests validate the count functionality across different scenarios:
+1. Basic counting with single file tasks
+2. Empty table handling (zero records)
+3. Large-scale counting with multiple file tasks
+
+The tests use mocking to simulate different table states without requiring actual
+Iceberg table infrastructure, ensuring fast and isolated unit tests.
+"""
+
+import pytest
+from unittest.mock import MagicMock, Mock, patch
+from pyiceberg.table import DataScan
+from pyiceberg.expressions import AlwaysTrue
+
+
+class DummyFile:

Review Comment:
   I think we could write real data files and use those for testing instead of mocks, wdyt?
   
   Here are some fixtures we could use to get a `FileScanTask` for a file with some rows in it: [example](https://github.com/apache/iceberg-python/blob/52d810efb62e39ec6d8d6a2f4cd2cad8165e2d2c/tests/conftest.py#L2408)
   
   Maybe we can also add some more fixtures to get `FileScanTask`s for empty files and large ones.
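
For reference, the three scenarios listed in the docstring come down to summing per-file record counts taken from metadata. Below is a minimal, stdlib-only sketch of that idea, assuming the count adds up each planned task's `record_count` when no delete files apply. `SketchDataFile`, `SketchScanTask`, and `count_from_metadata` are hypothetical names used for illustration only; they are not PyIceberg APIs, and the real `DataScan.count()` may handle more cases.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class SketchDataFile:
    """Hypothetical stand-in for a data file's metadata entry."""

    record_count: int


@dataclass
class SketchScanTask:
    """Hypothetical stand-in for a scan task that points at one data file."""

    file: SketchDataFile
    delete_files: List[str] = field(default_factory=list)


def count_from_metadata(tasks: Iterable[SketchScanTask]) -> int:
    """Sum record counts from file metadata without reading any rows.

    Assumption: tasks with delete files cannot be counted from metadata
    alone, so this sketch refuses them instead of guessing.
    """
    total = 0
    for task in tasks:
        if task.delete_files:
            raise NotImplementedError("counting with delete files needs a real read")
        total += task.file.record_count
    return total


# The three scenarios from the docstring: single file, empty table, many files.
single = [SketchScanTask(SketchDataFile(record_count=42))]
empty: List[SketchScanTask] = []
many = [SketchScanTask(SketchDataFile(record_count=500_000)) for _ in range(4)]
```

Fixtures backed by real data files would exercise the same three shapes, with the record counts coming from the written files rather than from hand-built objects.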



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


