[jira] [Work logged] (BEAM-5445) Update SpannerIO to support unbounded writes

ASF GitHub Bot (JIRA) Tue, 23 Oct 2018 05:12:18 -0700


     [ https://issues.apache.org/jira/browse/BEAM-5445?focusedWorklogId=157515&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-157515 ]


ASF GitHub Bot logged work on BEAM-5445:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/Oct/18 12:11
            Start Date: 23/Oct/18 12:11
    Worklog Time Spent: 10m 
      Work Description: nielm commented on a change in pull request #6478: 
[BEAM-5445] [BEAM-4796] [BEAM-3516] SpannerIO: Only batch on the current 
bundle. Adds streaming support
URL: https://github.com/apache/beam/pull/6478#discussion_r227359936
 
 

 ##########
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
 ##########
 @@ -781,11 +817,29 @@ public Write withFailureMode(FailureMode failureMode) {
       return toBuilder().setFailureMode(failureMode).build();
     }
 
-    /** Specifies the cell mutation limit. */
+    /** Specifies the cell mutation limit (maximum number of mutated cells per batch). */
     public Write withMaxNumMutations(long maxNumMutations) {
       return toBuilder().setMaxNumMutations(maxNumMutations).build();
     }
 
+    /**
+     * Specifies an input PCollection that can be used with a {@code Wait.on(signal)} to indicate
+     * when the database schema is ready. To be used when the schema creation is part of the
+     * pipeline to prevent the connector reading the schema too early.
+     */
+    public Write withSchemaReadySignal(PCollection signal) {
 
 Review comment:
   AFAIK, using [Wait.on(PCollection)](https://beam.apache.org/releases/javadoc/2.7.0/org/apache/beam/sdk/transforms/Wait.html) is the only way to 'pause' one section of a pipeline until a dependent section of the pipeline has completed.
   The given PCollection is the signal to be passed to Wait.on().
   Callbacks would not work because this needs to be triggered as part of the running pipeline.
   
   This transform reads the schema on start-up (line 914). If the schema is created by a separate section of the pipeline, there would be a race condition between creating the schema and this transform reading it.
   
   I have improved the javadoc with a link to Wait.OnSignal and made it clearer when and why this would be used.
   This API is also described in the class-level Javadoc section `Database Schema Preparation`.
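
The race described above can be illustrated outside Beam. The following is a plain-Java analogy only, not Beam or SpannerIO code: a `CompletableFuture` stands in for the signal PCollection, and every class and method name in it is hypothetical. It shows an ungated schema read arriving too early, and a gated read that runs only once the signal has completed.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Plain-Java analogy only; none of these names come from Beam or SpannerIO.
class SchemaReadySignalSketch {
    // Hypothetical stand-in for the database schema: table name -> columns.
    private final Map<String, String> schema = new ConcurrentHashMap<>();

    // Stand-in for the pipeline section that creates the schema.
    // Completing `signal` plays the role of the signal PCollection finishing.
    void createSchema(CompletableFuture<Void> signal) {
        schema.put("users", "id,name");
        signal.complete(null);
    }

    // Stand-in for the writer reading the schema on start-up, with no gating.
    String readSchema() {
        return schema.getOrDefault("users", "<missing>");
    }

    // Gated read: the lookup runs only once `signal` has completed.
    CompletableFuture<String> readSchemaAfter(CompletableFuture<Void> signal) {
        return signal.thenApply(ignored -> readSchema());
    }

    public static void main(String[] args) {
        SchemaReadySignalSketch db = new SchemaReadySignalSketch();
        CompletableFuture<Void> signal = new CompletableFuture<>();

        // Register the gated read before the schema exists.
        CompletableFuture<String> gated = db.readSchemaAfter(signal);

        // An ungated read at this point sees no schema.
        System.out.println("ungated read: " + db.readSchema());

        // Create the schema and fire the signal; the gated read now proceeds.
        db.createSchema(signal);
        System.out.println("gated read:   " + gated.join());
    }
}
```

In the real pipeline the gating is done by `Wait.on(signal)` inside the running job rather than by a future, which is why a callback outside the pipeline would not be a substitute.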
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 157515)
    Time Spent: 3.5h  (was: 3h 20m)

> Update SpannerIO to support unbounded writes
> --------------------------------------------
>
>                 Key: BEAM-5445
>                 URL: https://issues.apache.org/jira/browse/BEAM-5445
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-gcp
>            Reporter: Chamikara Jayalath
>            Assignee: Niel Markwick
>            Priority: Major
>             Fix For: 2.9.0
>
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Currently, due to a known issue, streaming pipelines that use SpannerIO.Write 
> do not actually write to Spanner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
