Strikerrx01 commented on code in PR #34135:
URL: https://github.com/apache/beam/pull/34135#discussion_r1976635680


##########
sdks/python/apache_beam/io/gcp/bigquery_tools.py:
##########
@@ -786,11 +788,58 @@ def get_table(self, project_id, dataset_id, table_id):
     Raises:
       HttpError: if lookup failed.
     """
+    cache_key = f"{project_id}:{dataset_id}.{table_id}"
+    if cache_key in self._table_cache:
+      _LOGGER.debug("Cache hit for table: %s", cache_key)
+      return self._table_cache[cache_key]
+
+    _LOGGER.debug("Cache miss for table: %s", cache_key)
     request = bigquery.BigqueryTablesGetRequest(
         projectId=project_id, datasetId=dataset_id, tableId=table_id)
     response = self.client.tables.Get(request)
+    
+    # Store the response in cache
+    self._table_cache[cache_key] = response
     return response
 
+  def clear_table_cache(self, project_id=None, dataset_id=None, table_id=None):
+    """Clear the cache for tables.

Review Comment:
   Hi @liferoad, thanks for the review!
   
   The `clear_table_cache` method serves two purposes:
   
   1. **Automatic cache clearing**: With my recent commit, the cache is now 
automatically cleared in the `set_table_definition_ttl` method when changing 
from disabled caching (TTL=0) to enabled caching (TTL>0). This is important 
because if a user had disabled caching and then re-enables it, we want to 
ensure they get fresh data rather than potentially stale entries.
   
   2. **Manual clearing (API)**: It also provides a public API for users to 
manually clear the cache in specific scenarios if needed:
      - Clear the entire cache by calling `wrapper.clear_table_cache()`
      - Clear entries for a specific project with 
`wrapper.clear_table_cache(project_id='my-project')`
      - Clear entries for a specific dataset with 
`wrapper.clear_table_cache(project_id='my-project', dataset_id='my-dataset')`
      - Clear a specific table entry with 
`wrapper.clear_table_cache(project_id='my-project', dataset_id='my-dataset', 
table_id='my-table')`
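
   To make the two behaviours above concrete, here is a minimal, 
self-contained sketch. It is an illustration only, not the code in this PR: 
the class name `TableCacheSketch`, the `_ttl_secs` attribute, and the 
prefix-matching strategy are assumptions. Only the cache key format, the 
method names, and the TTL 0 -> TTL>0 rule come from the diff and the comment 
above.

```python
class TableCacheSketch:
  """Hypothetical stand-in for the wrapper's table-definition cache."""

  def __init__(self, ttl_secs=0):
    # Keys follow the format used in the diff: "project:dataset.table".
    self._table_cache = {}
    self._ttl_secs = ttl_secs  # assumed attribute name; 0 means disabled

  def set_table_definition_ttl(self, ttl_secs):
    # Going from disabled (0) to enabled (>0) drops any leftover entries
    # so that callers start from fresh data.
    if self._ttl_secs == 0 and ttl_secs > 0:
      self.clear_table_cache()
    self._ttl_secs = ttl_secs

  def clear_table_cache(self, project_id=None, dataset_id=None, table_id=None):
    if project_id is None:
      # No arguments: clear everything.
      self._table_cache.clear()
      return
    if dataset_id is not None and table_id is not None:
      # Fully qualified: remove a single entry if it is present.
      self._table_cache.pop(f"{project_id}:{dataset_id}.{table_id}", None)
      return
    # Project-only or project+dataset: remove every key with that prefix.
    if dataset_id is None:
      prefix = f"{project_id}:"
    else:
      prefix = f"{project_id}:{dataset_id}."
    for key in [k for k in self._table_cache if k.startswith(prefix)]:
      del self._table_cache[key]
```

   In this sketch a `table_id` passed without a `dataset_id` is ignored and 
the whole project is cleared; the real method may validate that combination 
differently.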
   
   The class itself handles automatic clearing in the most relevant 
scenarios, but exposing this method gives users fine-grained control when 
they need to force a refresh of specific table definitions.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
