Re: [PR] feat: support pagination in `list_*` methods in rest catalog [iceberg-python]

via GitHub Fri, 04 Jul 2025 08:52:05 -0700


jayceslesar commented on code in PR #2158:
URL: https://github.com/apache/iceberg-python/pull/2158#discussion_r2185687161



##########
pyiceberg/catalog/rest/__init__.py:
##########
@@ -584,15 +587,47 @@ def register_table(self, identifier: Union[str, 
Identifier], metadata_location:
         return self._response_to_table(self.identifier_to_tuple(identifier), 
table_response)
 
     @retry(**_RETRY_ARGS)
-    def list_tables(self, namespace: Union[str, Identifier]) -> 
List[Identifier]:
-        namespace_tuple = self._check_valid_namespace_identifier(namespace)
+    def list_tables_raw(
+        self, namespace: Union[str, Identifier], page_size: Optional[int] = 
None, next_page_token: Optional[str] = None
+    ) -> ListTablesResponse:
+        """List Tables, optionally provide page size and next page token.
+
+        Args:
+            namespace (Union[str, Identifier]): Namespace to list against
+            page_size (int): Number of namespaces to return per request.
+            next_page_token (Optional[str]): Token for pagination.
+
+        Returns:
+            ListTablesResponse: List of tables.
+        """
+        namespace_tuple = self.identifier_to_tuple(namespace)
         namespace_concat = NAMESPACE_SEPARATOR.join(namespace_tuple)
-        response = self._session.get(self.url(Endpoints.list_tables, 
namespace=namespace_concat))
+        params: Dict[str, Any] = {}
+        if next_page_token is not None:
+            params["pageToken"] = next_page_token
+        if page_size is not None:
+            params["pageSize"] = page_size
+        response = self._session.get(self.url(Endpoints.list_tables, 
namespace=namespace_concat), params=params)
         try:
             response.raise_for_status()
         except HTTPError as exc:
             _handle_non_200_response(exc, {404: NoSuchNamespaceError})
-        return [(*table.namespace, table.name) for table in 
ListTablesResponse.model_validate_json(response.text).identifiers]
+        return ListTablesResponse.model_validate_json(response.text)
+
+    @retry(**_RETRY_ARGS)
+    def list_tables(self, namespace: Union[str, Identifier], page_size: 
Optional[int] = None) -> List[Identifier]:
+        tables = []
+
+        next_page_token = None
+        while True:
+            list_tables_response = self.list_tables_raw(namespace=namespace, 
page_size=page_size, next_page_token=next_page_token)
+            tables.extend([(*table.namespace, table.name) for table in 
list_tables_response.identifiers])
+            if list_tables_response.next_page_token is None:
+                break
+            else:
+                next_page_token = list_tables_response.next_page_token
+
+        return tables

Review Comment:
   Agreed on switching to iterable, subclassing list seems sort of like a lot 
just to support pagination. I will have a commit that makes all list responses 
from catalogs iterables later...fighting one issue with click and how it 
handles the context but other than that all tests are passing with that swap 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] feat: support pagination in `list_*` methods in rest catalog [iceberg-python]

Reply via email to