[ 
https://issues.apache.org/jira/browse/BEAM-12164?focusedWorklogId=753173&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-753173
 ]

ASF GitHub Bot logged work on BEAM-12164:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Apr/22 01:56
            Start Date: 06/Apr/22 01:56
    Worklog Time Spent: 10m 
      Work Description: thiagotnunes commented on code in PR #17200:
URL: https://github.com/apache/beam/pull/17200#discussion_r843395528


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/restriction/ThroughputEstimator.java:
##########
@@ -0,0 +1,114 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner.changestreams.restriction;
+
+import com.google.cloud.Timestamp;
+import java.io.Serializable;
+import java.util.ArrayDeque;
+import java.util.Queue;
+
+/** An estimator to provide an estimate on the throughput of the outputted 
elements. */
+public class ThroughputEstimator implements Serializable {
+
+  private static class Pair<K, V> {
+    private final K first;
+    private final V second;
+
+    public Pair(K first, V second) {
+      this.first = first;
+      this.second = second;
+    }
+
+    public K getFirst() {
+      return first;
+    }
+
+    public V getSecond() {
+      return second;
+    }
+
+    @Override
+    public String toString() {
+      return String.format("first: %s, second: %s", first, second);
+    }
+  }
+
+  private static final long serialVersionUID = -3597929310338724800L;
+  // The start time of each per-second window.
+  private Timestamp startTimeOfCurrentWindow;
+  // The bytes of the current window.
+  private double bytesInCurrentWindow;
+  // The number of seconds to look in the past.
+  private final int numOfSeconds = 60;
+  // The total bytes of all windows in the queue.
+  private double bytesInQueue;
+  // The queue holds a number of windows in the past in order to calculate
+  // a rolling windowing throughput.
+  private Queue<Pair<Timestamp, Double>> queue;
+
+  public ThroughputEstimator() {
+    queue = new ArrayDeque<>();
+  }
+
+  /**
+   * Updates the estimator with the bytes of records.
+   *
+   * @param timeOfRecords the committed timestamp of the records
+   * @param bytes the total bytes of the records
+   */
+  public void update(Timestamp timeOfRecords, double bytes) {
+    if (startTimeOfCurrentWindow == null) {
+      bytesInCurrentWindow = bytes;
+      startTimeOfCurrentWindow = timeOfRecords;
+      return;
+    }
+
+    if (timeOfRecords.getSeconds() < startTimeOfCurrentWindow.getSeconds() + 
1) {

Review Comment:
   Ah, I missed the `+ 1` lol, nevermind





Issue Time Tracking
-------------------

    Worklog Id:     (was: 753173)
    Time Spent: 67h 10m  (was: 67h)

> SpannerIO Change Stream Connector
> ---------------------------------
>
>                 Key: BEAM-12164
>                 URL: https://issues.apache.org/jira/browse/BEAM-12164
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-core
>            Reporter: Thiago Nunes
>            Assignee: Thiago Nunes
>            Priority: P2
>             Fix For: 2.37.0
>
>          Time Spent: 67h 10m
>  Remaining Estimate: 0h
>
> We would like to augment the existing Google Cloud SpannerIO connector 
> ([https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java)]
>  with the support for Spanner Change Streams (CDC). CDC support is just being 
> implemented in Spanner and it will be exposed through a gRPC API. We will use 
> such API to create a new SpannerIO.readChangeStream(...) implementation.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to