[ 
https://issues.apache.org/jira/browse/BEAM-10259?focusedWorklogId=450584&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450584
 ]

ASF GitHub Bot logged work on BEAM-10259:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 24/Jun/20 19:06
            Start Date: 24/Jun/20 19:06
    Worklog Time Spent: 10m 
      Work Description: allenpradeep commented on a change in pull request 
#12010:
URL: https://github.com/apache/beam/pull/12010#discussion_r445110010



##########
File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerAccessor.java
##########
@@ -31,33 +31,73 @@
 import io.grpc.ClientCall;
 import io.grpc.ClientInterceptor;
 import io.grpc.MethodDescriptor;
+import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicInteger;
 import org.apache.beam.sdk.options.ValueProvider;
 import org.apache.beam.sdk.util.ReleaseInfo;
 import org.joda.time.Duration;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /** Manages lifecycle of {@link DatabaseClient} and {@link Spanner} instances. 
*/
 class SpannerAccessor implements AutoCloseable {
+  private static final Logger LOG = 
LoggerFactory.getLogger(SpannerAccessor.class);
+
   // A common user agent token that indicates that this request was originated 
from Apache Beam.
   private static final String USER_AGENT_PREFIX = "Apache_Beam_Java";
 
+  // Only create one SpannerAccessor for each different SpannerConfig.
+  private static final ConcurrentHashMap<SpannerConfig, SpannerAccessor> 
spannerAccessors =
+      new ConcurrentHashMap<>();
+
+  // Keep reference counts of each SpannerAccessor's usage so that we can close
+  // it when it is no longer in use.
+  private static final ConcurrentHashMap<SpannerConfig, AtomicInteger> 
refcounts =
+      new ConcurrentHashMap<>();
+
   private final Spanner spanner;
   private final DatabaseClient databaseClient;
   private final BatchClient batchClient;
   private final DatabaseAdminClient databaseAdminClient;
+  private final SpannerConfig spannerConfig;
 
   private SpannerAccessor(
       Spanner spanner,
       DatabaseClient databaseClient,
       DatabaseAdminClient databaseAdminClient,
-      BatchClient batchClient) {
+      BatchClient batchClient,
+      SpannerConfig spannerConfig) {
     this.spanner = spanner;
     this.databaseClient = databaseClient;
     this.databaseAdminClient = databaseAdminClient;
     this.batchClient = batchClient;
+    this.spannerConfig = spannerConfig;
   }
 
-  static SpannerAccessor create(SpannerConfig spannerConfig) {

Review comment:
       NIT: Should we maintain this function for compatibility sake? There 
might be external customers who may be using create call in their template. 
Would they be affected when they start using this code.
   Internally, this can call getOrCreate()
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 450584)
    Time Spent: 1h 40m  (was: 1.5h)

> Spanner Session leak/overload in Streaming Dataflow
> ---------------------------------------------------
>
>                 Key: BEAM-10259
>                 URL: https://issues.apache.org/jira/browse/BEAM-10259
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>    Affects Versions: 2.18.0, 2.19.0, 2.21.0, 2.22.0
>            Reporter: Niel Markwick
>            Assignee: Niel Markwick
>            Priority: P2
>              Labels: dataflow, gcp, io
>             Fix For: 2.23.0
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> SpannerIO.WriteToSpannerFn connects to Spanner every time @Setup is called, 
> and closes the connection every time @Teardown is called. 
> This actually creates a separate Spanner connection and session pool for each 
> WriteToSpannerFn, which generally speaking is one per thread
> In single-threaded runners (eg batch dataflow on a single vCPU machine) this 
> is not an issue, as there is normally only one WriteToSpannerFn per 
> node/process.
> In multi-threaded runners (eg streaming dataflow, or batch on multiple CPU 
> machines), this can cause a problem with many session pools created (1 per 
> thread) which can cause a respource leak, and is in general wasteful.
> Spanner connections (and session pools) should be shared among all threads of 
> a single process. so that the connection is only opened and closed once.
> [~alxavier]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to