[
https://issues.apache.org/jira/browse/BEAM-10259?focusedWorklogId=450537&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450537
]
ASF GitHub Bot logged work on BEAM-10259:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 24/Jun/20 17:09
Start Date: 24/Jun/20 17:09
Worklog Time Spent: 10m
Work Description: nielm commented on a change in pull request #12010:
URL: https://github.com/apache/beam/pull/12010#discussion_r445045164
##########
File path:
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -1373,10 +1375,22 @@ public void processElement(ProcessContext c) {
@VisibleForTesting
static class WriteToSpannerFn extends DoFn<Iterable<MutationGroup>, Void> {
- private transient SpannerAccessor spannerAccessor;
private final SpannerConfig spannerConfig;
private final FailureMode failureMode;
+ // Only create one SpannerAccessor for each different SpannerConfig.
Review comment:
Good call -- moved to SpannerAccessor.
To make it clearer, SpannerAccessor.create(SpannerConfig) has been renamed
to getOrCreate(SpannerConfig).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 450537)
Time Spent: 50m (was: 40m)
> Spanner Session leak/overload in Streaming Dataflow
> ---------------------------------------------------
>
> Key: BEAM-10259
> URL: https://issues.apache.org/jira/browse/BEAM-10259
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Affects Versions: 2.18.0, 2.19.0, 2.21.0, 2.22.0
> Reporter: Niel Markwick
> Assignee: Niel Markwick
> Priority: P2
> Labels: dataflow, gcp, io
> Fix For: 2.23.0
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> SpannerIO.WriteToSpannerFn connects to Spanner every time @Setup is called,
> and closes the connection every time @Teardown is called.
> This actually creates a separate Spanner connection and session pool for each
> WriteToSpannerFn, which generally speaking is one per thread
> In single-threaded runners (eg batch dataflow on a single vCPU machine) this
> is not an issue, as there is normally only one WriteToSpannerFn per
> node/process.
> In multi-threaded runners (eg streaming dataflow, or batch on multiple CPU
> machines), this can cause a problem with many session pools created (1 per
> thread) which can cause a respource leak, and is in general wasteful.
> Spanner connections (and session pools) should be shared among all threads of
> a single process. so that the connection is only opened and closed once.
> [~alxavier]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)