[ 
https://issues.apache.org/jira/browse/BEAM-12865?focusedWorklogId=679844&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-679844
 ]

ASF GitHub Bot logged work on BEAM-12865:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 10/Nov/21 19:14
            Start Date: 10/Nov/21 19:14
    Worklog Time Spent: 10m 
      Work Description: quentin-sommer commented on a change in pull request #15489:
URL: https://github.com/apache/beam/pull/15489#discussion_r746911914



##########
File path: sdks/python/apache_beam/io/gcp/bigquery.py
##########
@@ -2158,7 +2169,7 @@ def expand(self, pcoll):
           schema=self.schema,
           create_disposition=self.create_disposition,
           write_disposition=self.write_disposition,
-          triggering_frequency=self.triggering_frequency,
+          triggering_frequency=int(self.triggering_frequency),

Review comment:
       I think it should be an integer. `BigQueryBatchFileLoads` uses it like this:
   
   https://github.com/apache/beam/blob/8da177f64d314cf72e89a51e51fb0915f706a784/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py#L873-L874
   
   The [Beam reference](https://beam.apache.org/releases/pydoc/2.33.0/apache_beam.transforms.trigger.html?highlight=trigger#apache_beam.transforms.trigger.AfterProcessingTime) states that the delay is in seconds. I'm not sure what the implementation does with the value, so I'd rather stay on the safe side and keep integers.
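
   A minimal sketch of the cast discussed in the comment above, written as a standalone helper. The name `to_whole_seconds` is hypothetical and not part of Beam. Passing `None` through is an assumption about whether the attribute can be unset (the diff does not show this); a bare `int(None)` would raise `TypeError`. Note also that `int()` truncates, so a sub-second value such as 0.2 would become 0, which the sketch rejects instead of passing on silently.

```python
import math
from numbers import Real


def to_whole_seconds(triggering_frequency):
    """Hypothetical helper: coerce a triggering frequency to whole seconds.

    Mirrors the int(...) cast in the diff, with explicit checks around it.
    """
    # Assumption: an unset frequency is represented by None and should stay None.
    if triggering_frequency is None:
        return None
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(triggering_frequency, bool) or not isinstance(
            triggering_frequency, Real):
        raise TypeError(
            "triggering_frequency must be a number, got %r" %
            (triggering_frequency, ))
    if not math.isfinite(triggering_frequency):
        raise ValueError("triggering_frequency must be finite")
    seconds = int(triggering_frequency)  # truncates toward zero
    if seconds < 1:
        raise ValueError(
            "triggering_frequency must be at least 1 second after truncation")
    return seconds
```

   Whether sub-second values should be rejected or rounded up is a design choice the review comment leaves open; the sketch only makes the truncation visible.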




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 679844)
    Time Spent: 7h 10m  (was: 7h)

> Allow customising batch duration when streaming with WriteToBigQuery
> --------------------------------------------------------------------
>
>                 Key: BEAM-12865
>                 URL: https://issues.apache.org/jira/browse/BEAM-12865
>             Project: Beam
>          Issue Type: New Feature
>          Components: io-py-gcp
>    Affects Versions: Not applicable
>            Reporter: Quentin Sommer
>            Priority: P2
>              Labels: stale-P2
>             Fix For: Not applicable
>
>          Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> Hi,
> We allow customising the {{batch_size}} when streaming to BigQuery, but the 
> batch duration (used by {{GroupIntoBatches}}) is fixed at 
> {{DEFAULT_BATCH_BUFFERING_DURATION_LIMIT_SEC}} (0.2 seconds).
> I'd like to add an option to specify the {{batch_duration}}, to allow better 
> batching in scenarios with low data throughput.
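
To illustrate why the batch duration matters at low throughput, here is a minimal plain-Python sketch of size-or-duration batching. It is not Beam code and does not reproduce `GroupIntoBatches`: the function name and the `(timestamp, element)` input shape are assumptions made for the example, and the duration limit is only checked when the next element arrives, whereas a real streaming runner would use timers.

```python
def batch_by_size_or_duration(timed_elements, batch_size, batch_duration):
    """Group (timestamp_sec, element) pairs into batches.

    A batch is emitted when it holds batch_size elements, or when a new
    element arrives at least batch_duration seconds after the first
    element currently buffered.
    """
    buffer = []
    first_ts = None
    for ts, element in timed_elements:
        # Duration limit reached: emit what has been buffered so far.
        if buffer and ts - first_ts >= batch_duration:
            yield buffer
            buffer = []
        if not buffer:
            first_ts = ts
        buffer.append(element)
        # Size limit reached: emit a full batch.
        if len(buffer) >= batch_size:
            yield buffer
            buffer = []
    # Emit any leftover elements at end of input.
    if buffer:
        yield buffer


# Sparse input: five elements spread over roughly one second.
events = [(0.0, "a"), (0.1, "b"), (0.5, "c"), (0.55, "d"), (0.9, "e")]

short = list(batch_by_size_or_duration(events, batch_size=3, batch_duration=0.2))
longer = list(batch_by_size_or_duration(events, batch_size=3, batch_duration=1.0))
```

With the 0.2 second limit the sparse input is split into three small batches; with a 1.0 second limit the same input yields one full batch and one partial batch.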



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
