stark256-spec opened a new pull request, #3454:
URL: https://github.com/apache/iceberg-python/pull/3454

   ## Problem
   
   `list_tables`, `list_views`, and `list_namespaces` in the REST catalog 
eagerly collect every page before returning, even if the caller only needs the 
first few results. In namespaces with thousands of tables this creates 
unnecessary network round-trips and latency before the first result is visible.
   
   Closes #3365
   
   ## Solution
   
   Adds `PaginationList[T]` (`pyiceberg/utils/pagination.py`) — a `list` 
subclass that pre-loads the first page and lazily fetches subsequent pages only 
as the caller iterates past items already in memory.
   
   ### Design
   
   | Operation | Behaviour |
   |-----------|-----------|
   | `for item in result` | Lazy: the next page is fetched only when the iterator exhausts the current buffer |
   | `result[0]` / `result[2]` | Lazy: fetches pages until the requested index is available |
   | `result[-1]` / `result[1:3]` / `len(result)` / `x in result` / `result == other` | Eager: fetches all remaining pages |
   | `isinstance(result, list)` | `True`: full backward compatibility |
   
   ### Key properties
   
   - **Zero breaking changes**: `PaginationList` subclasses `list`, so all 
existing call sites that iterate, compare, or extend the return value continue 
to work without modification.
   - **First page always pre-loaded**: Callers that only look at the first few 
items pay zero extra latency compared to the old implementation.
   - **Single fetch per page**: Each page token is consumed at most once; no 
redundant requests.
   
   ## Changes
   
   - `pyiceberg/utils/pagination.py` — new `PaginationList[T]` class
   - `pyiceberg/catalog/rest/__init__.py` — `list_tables`, `list_views`, 
`list_namespaces` refactored to return `PaginationList`
   - `tests/utils/test_pagination.py` — 14 unit tests for all `PaginationList` 
operations
   - `tests/catalog/test_rest.py` — `test_list_tables_returns_pagination_list` 
verifies lazy behaviour (call count stays at 1 while iterating within the first 
page, rises to 2 only after crossing the page boundary)
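
The call-count assertion described for `test_list_tables_returns_pagination_list` might look something like the sketch below. It is only an illustration of the assertion pattern: `iter_items` is a hypothetical generator used in place of `PaginationList` (it does not pre-load the first page), and the `fetch_page` mock stands in for the REST request, so none of these names come from the PR's test code.

```python
from typing import Callable, Iterator, List, Optional, Tuple
from unittest.mock import Mock


def iter_items(
    fetch_page: Callable[[Optional[str]], Tuple[List[str], Optional[str]]]
) -> Iterator[str]:
    """Hypothetical lazy pager: request the next page only once it is needed."""
    token: Optional[str] = None
    while True:
        items, token = fetch_page(token)
        yield from items
        if token is None:
            return


def test_second_page_fetched_only_after_boundary() -> None:
    # Two fake pages; each mock call returns the next (items, next_token) pair.
    fetch_page = Mock(
        side_effect=[(["tbl_1", "tbl_2"], "page-2"), (["tbl_3"], None)]
    )
    items = iter_items(fetch_page)

    # Iterating within the first page must not trigger a second request.
    assert next(items) == "tbl_1"
    assert next(items) == "tbl_2"
    assert fetch_page.call_count == 1

    # Crossing the page boundary triggers exactly one more request.
    assert next(items) == "tbl_3"
    assert fetch_page.call_count == 2
    fetch_page.assert_called_with("page-2")
```

Counting calls on a mock keeps the test independent of network timing; the same pattern should apply if the HTTP layer is mocked instead of a callback.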


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

