[
https://issues.apache.org/jira/browse/BEAM-14388?focusedWorklogId=765536&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-765536
]
ASF GitHub Bot logged work on BEAM-14388:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 03/May/22 16:52
Start Date: 03/May/22 16:52
Worklog Time Spent: 10m
Work Description: reuvenlax commented on code in PR #17417:
URL: https://github.com/apache/beam/pull/17417#discussion_r863985565
##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/StorageApiWriteUnshardedRecords.java:
##########
@@ -275,7 +321,13 @@ void flush(RetryManager<AppendRowsResponse, Context<AppendRowsResponse>> retryMa
             offset = this.currentOffset;
             this.currentOffset += inserts.getSerializedRowsCount();
           }
-          return writeStream.appendRows(offset, protoRows);
+          ApiFuture<AppendRowsResponse> response = writeStream.appendRows(offset, protoRows);
+          inflightWaitSecondsDistribution.update(writeStream.getInflightWaitSeconds());
+          if (writeStream.getInflightWaitSeconds() > 5) {
+            LOG.warn(
+                "Storage Api write delay more than " + writeStream.getInflightWaitSeconds());
+          }
+          return response;
Review Comment:
getInflightWaitSeconds does, for the most part, measure the execution time of
the lambda (the rest of the code merely appends to an in-memory list, which
should be very fast). I think locking is better measured with a contention
metric; we need to see whether we have any way of measuring this in the wild.
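
To make the distinction above concrete, here is a minimal sketch of what a contention metric could look like. It assumes the critical section is guarded by a plain ReentrantLock; the TimedLock class and its accessors are hypothetical names for illustration and are not part of Beam or the BigQuery client. The idea is to record the time spent waiting to acquire the lock separately from the time spent inside the guarded lambda, so that a slow lambda and a contended lock show up as different numbers.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical helper (not a Beam API): separates lock wait time from hold time.
final class TimedLock {
  private final ReentrantLock lock = new ReentrantLock();
  private final AtomicLong waitNanos = new AtomicLong();
  private final AtomicLong holdNanos = new AtomicLong();
  private final AtomicLong contendedAcquires = new AtomicLong();

  // Runs body while holding the lock and records how long acquisition took.
  <T> T withLock(Supplier<T> body) {
    long requested = System.nanoTime();
    // tryLock fails only if another thread holds the lock, i.e. we are contended.
    if (!lock.tryLock()) {
      contendedAcquires.incrementAndGet();
      lock.lock();
    }
    long acquired = System.nanoTime();
    try {
      return body.get();
    } finally {
      long released = System.nanoTime();
      waitNanos.addAndGet(acquired - requested);
      holdNanos.addAndGet(released - acquired);
      lock.unlock();
    }
  }

  long waitNanos() {
    return waitNanos.get();
  }

  long holdNanos() {
    return holdNanos.get();
  }

  long contendedAcquires() {
    return contendedAcquires.get();
  }
}
```

In a pipeline, the accumulated wait time and contended-acquire count could be reported through whatever metrics mechanism is already in place, next to the existing in-flight wait distribution. Whether this is cheap enough to leave enabled in production is an open question this sketch does not answer.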
Issue Time Tracking
-------------------
Worklog Id: (was: 765536)
Time Spent: 0.5h (was: 20m)
> Performance problems when using Storage API writes
> --------------------------------------------------
>
> Key: BEAM-14388
> URL: https://issues.apache.org/jira/browse/BEAM-14388
> Project: Beam
> Issue Type: Improvement
> Components: io-java-gcp
> Reporter: Reuven Lax
> Priority: P2
> Time Spent: 0.5h
> Remaining Estimate: 0h
>