[jira] [Work logged] (BEAM-11359) Clean up temporary dataset after ReadAllFromBQ executes

ASF GitHub Bot (Jira) Thu, 10 Jun 2021 16:25:06 -0700


     [ 
https://issues.apache.org/jira/browse/BEAM-11359?focusedWorklogId=609992&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-609992
 ]


ASF GitHub Bot logged work on BEAM-11359:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 10/Jun/21 23:24
            Start Date: 10/Jun/21 23:24
    Worklog Time Spent: 10m 
      Work Description: pabloem commented on a change in pull request #14745:
URL: https://github.com/apache/beam/pull/14745#discussion_r649593016



##########
File path: sdks/python/apache_beam/io/gcp/bigquery_read_internal.py
##########
@@ -183,15 +183,16 @@ def process(self,
               element: 'ReadFromBigQueryRequest') -> Iterable[BoundedSource]:
     bq = bigquery_tools.BigQueryWrapper(
         temp_dataset_id=self._get_temp_dataset().datasetId)
-    # TODO(BEAM-11359): Clean up temp dataset at pipeline completion.
 
     if element.query is not None:
       self._setup_temporary_dataset(bq, element)
       table_reference = self._execute_query(bq, element)
+      created_temp_dataset = True

Review comment:
       I don't think this is enough to be sure that we created the dataset. 
You may need to change `_setup_temporary_dataset`, and this: 
https://github.com/apache/beam/blob/2aed67b1fbacce923e22347400251c34a1f6ab2c/sdks/python/apache_beam/io/gcp/bigquery_tools.py#L788-L814
to return something to the caller indicating whether the dataset was created 
or not.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 609992)
    Time Spent: 4h 50m  (was: 4h 40m)

> Clean up temporary dataset after ReadAllFromBQ executes
> -------------------------------------------------------
>
>                 Key: BEAM-11359
>                 URL: https://issues.apache.org/jira/browse/BEAM-11359
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-py-gcp
>            Reporter: Pablo Estrada
>            Priority: P3
>          Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Currently, the transform creates (or receives) a temp dataset and it does not 
> clean it up. Only one is created per pipeline, so it's not too bad, but it's 
> not ideal.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
