CrynetLogistics commented on a change in pull request #17687:
URL: https://github.com/apache/flink/pull/17687#discussion_r743047568
##########
File path:
flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/AsyncSinkBase.java
##########
@@ -49,6 +50,28 @@
public abstract class AsyncSinkBase<InputT, RequestEntryT extends Serializable>
implements Sink<InputT, Void, Collection<RequestEntryT>, Void> {
+ protected final ElementConverter<InputT, RequestEntryT> elementConverter;
+ protected final int maxBatchSize;
+ protected final int maxInFlightRequests;
+ protected final int maxBufferedRequests;
+ protected final long flushOnBufferSizeInBytes;
+ protected final long maxTimeInBufferMS;
Review comment:
I would have thought the concrete implementation of the sink needs to
access these values when creating the `SinkWriter`, e.g. on line 111 of
`flink-connectors/flink-connector-aws/src/main/java/org/apache/flink/connector/kinesis/sink/KinesisDataStreamsSink.java`.
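
Roughly what I have in mind, as a minimal sketch. All class names below are hypothetical stand-ins, not the real Flink types, and only two of the fields are shown: the subclass reads the `protected` fields of its base class when it constructs the writer.

```java
import java.io.Serializable;

// Hypothetical stand-in for the abstract sink base; not the real AsyncSinkBase.
abstract class SketchSinkBase<RequestEntryT extends Serializable> {
    protected final int maxBatchSize;
    protected final long maxTimeInBufferMS;

    protected SketchSinkBase(int maxBatchSize, long maxTimeInBufferMS) {
        this.maxBatchSize = maxBatchSize;
        this.maxTimeInBufferMS = maxTimeInBufferMS;
    }

    abstract SketchWriter<RequestEntryT> createWriter();
}

// Hypothetical stand-in for the writer that needs the buffering hints.
final class SketchWriter<RequestEntryT extends Serializable> {
    final int maxBatchSize;
    final long maxTimeInBufferMS;

    SketchWriter(int maxBatchSize, long maxTimeInBufferMS) {
        this.maxBatchSize = maxBatchSize;
        this.maxTimeInBufferMS = maxTimeInBufferMS;
    }
}

// The concrete sink can only forward the values because they are visible to it.
final class SketchConcreteSink extends SketchSinkBase<String> {
    SketchConcreteSink(int maxBatchSize, long maxTimeInBufferMS) {
        super(maxBatchSize, maxTimeInBufferMS);
    }

    @Override
    SketchWriter<String> createWriter() {
        return new SketchWriter<>(maxBatchSize, maxTimeInBufferMS);
    }
}
```

If the fields were `private`, the base class would need to expose getters for `createWriter()` to keep working.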
##########
File path:
flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/AsyncSinkBaseBuilder.java
##########
@@ -0,0 +1,115 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.base.sink;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.connector.base.sink.writer.ElementConverter;
+
+import java.io.Serializable;
+
+/**
+ * Abstract builder for constructing a concrete implementation of {@link AsyncSinkBase}.
+ *
+ * @param <InputT> type of elements that should be persisted in the destination
+ * @param <RequestEntryT> type of payload that contains the element and additional metadata that is
+ * required to submit a single element to the destination
+ * @param <ConcreteBuilderT> type of concrete implementation of this builder class
+ */
+@PublicEvolving
+public abstract class AsyncSinkBaseBuilder<
+ InputT,
+ RequestEntryT extends Serializable,
+ ConcreteBuilderT extends AsyncSinkBaseBuilder<?, ?, ?>> {
+
+ protected ElementConverter<InputT, RequestEntryT> elementConverter;
+ protected Integer maxBatchSize;
+ protected Integer maxInFlightRequests;
+ protected Integer maxBufferedRequests;
+ protected Long flushOnBufferSizeInBytes;
+ protected Long maxTimeInBufferMS;
Review comment:
Wouldn't the concrete builder need to verify and use these, as in
`flink-connectors/flink-connector-aws/src/main/java/org/apache/flink/connector/kinesis/sink/KinesisDataStreamsSinkBuilder.java`?
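
A minimal sketch of the pattern I mean. The class names, default values, and `String` return type are hypothetical and chosen only for illustration; this is not the actual Kinesis builder. The base builder holds nullable fields, and the concrete builder applies defaults and validates them in `build()`.

```java
import java.util.Optional;

// Hypothetical stand-in for the abstract builder; not the real AsyncSinkBaseBuilder.
abstract class SketchBuilderBase<ConcreteBuilderT extends SketchBuilderBase<?>> {
    protected Integer maxBatchSize;
    protected Long maxTimeInBufferMS;

    @SuppressWarnings("unchecked")
    ConcreteBuilderT setMaxBatchSize(int maxBatchSize) {
        this.maxBatchSize = maxBatchSize;
        return (ConcreteBuilderT) this;
    }

    @SuppressWarnings("unchecked")
    ConcreteBuilderT setMaxTimeInBufferMS(long maxTimeInBufferMS) {
        this.maxTimeInBufferMS = maxTimeInBufferMS;
        return (ConcreteBuilderT) this;
    }
}

final class SketchConcreteBuilder extends SketchBuilderBase<SketchConcreteBuilder> {
    // Assumed defaults, for illustration only.
    private static final int DEFAULT_MAX_BATCH_SIZE = 200;
    private static final long DEFAULT_MAX_TIME_IN_BUFFER_MS = 5000L;

    // Reads the protected fields, falls back to defaults, then validates.
    String build() {
        int batchSize = Optional.ofNullable(maxBatchSize).orElse(DEFAULT_MAX_BATCH_SIZE);
        long timeInBuffer =
                Optional.ofNullable(maxTimeInBufferMS).orElse(DEFAULT_MAX_TIME_IN_BUFFER_MS);
        if (batchSize <= 0) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        if (timeInBuffer <= 0) {
            throw new IllegalArgumentException("maxTimeInBufferMS must be positive");
        }
        return "maxBatchSize=" + batchSize + ", maxTimeInBufferMS=" + timeInBuffer;
    }
}
```

The boxed `Integer`/`Long` fields let the concrete builder tell "not set" apart from an explicit value, which is why it needs direct access to them.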
##########
File path:
flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/AsyncSinkWriter.java
##########
@@ -172,6 +229,15 @@ public AsyncSinkWriter(
this.inFlightRequestsCount = 0;
this.bufferedRequestEntriesTotalSizeInBytes = 0;
+
+ this.metrics = context.metricGroup();
+ this.metrics.setCurrentSendTimeGauge(
+ () -> {
+ long time = this.ackTime - this.lastSendTimestamp;
+ return time < 0 ? this.lastSendDuration : time;
Review comment:
It's true, race conditions between the in-flight requests can occur. I've
modified it so that updates to `ackTime` and `lastSendTimestamp` now happen in
the same single-threaded block. `lastSendDuration` has been removed since it's
no longer necessary. I've also added a test to prove this.
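
To illustrate the idea (a simplified sketch, not the code in this PR): each request carries its own start time, and both values are applied together in one task on a single thread, so the gauge cannot combine the send time of one request with the acknowledgement of another. The class name, the executor standing in for the mailbox thread, and the array snapshot are all assumptions of the sketch; the snapshot is an extra precaution in case the gauge is read from a different thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in; not the real AsyncSinkWriter.
final class SendTimeSketch {
    // Stands in for the mailbox thread: tasks run one at a time, in submission order.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();

    // {lastSendTimestamp, ackTime} of the most recently completed request,
    // published together so a reader never mixes values from two requests.
    private volatile long[] lastCompleted = {0L, 0L};

    // May be called from any client callback thread.
    void onRequestCompleted(long requestStartTime, long completionTime) {
        mailbox.execute(
                () -> lastCompleted = new long[] {requestStartTime, completionTime});
    }

    // What the send-time gauge would report.
    long currentSendTime() {
        long[] snapshot = lastCompleted;
        return snapshot[1] - snapshot[0];
    }

    // Runs the already-submitted updates, then stops the executor.
    void close() {
        mailbox.shutdown();
        try {
            mailbox.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because the difference is always taken within one request, it cannot go negative as long as each request completes after it starts, which is why a fallback like `lastSendDuration` is not needed.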
##########
File path:
flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/AsyncSinkWriter.java
##########
@@ -114,9 +134,38 @@
*
* <p>The method is invoked with a set of request entries according to the buffering hints (and
* the valid limits of the destination). The logic then needs to create and execute the request
- * against the destination (ideally by batching together multiple request entries to increase
- * efficiency). The logic also needs to identify individual request entries that were not
- * persisted successfully and resubmit them using the {@code requeueFailedRequestEntry} method.
+ * asynchronously against the destination (ideally by batching together multiple request entries
+ * to increase efficiency). The logic also needs to identify individual request entries that
+ * were not persisted successfully and resubmit them using the {@code requestResult} callback.
+ *
+ * <p>From a threading perspective, the mailbox thread will call this method and initiate the
+ * asynchronous request to persist the {@code requestEntries}. NOTE: The client must support
+ * asynchronous requests and the method called to persist the records must asynchronously
Review comment:
As you mentioned in chat, this design comes from FLIP-171: the idea is for
the concrete sink implementer to pull in an async client (although people will
still have the option of adding a thread pool there and using a sync client if
they so desire).
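
For the thread-pool option, a minimal sketch of what an implementer could do. The client and adapter classes are hypothetical; the parameter names `requestEntries` and `requestResult` are borrowed from the Javadoc above, but the signature (including the returned future) is an assumption and not the real `AsyncSinkWriter` API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical blocking client: returns the entries it failed to persist.
final class BlockingClientSketch {
    List<String> putRecords(List<String> entries) {
        List<String> failed = new ArrayList<>();
        for (String entry : entries) {
            if (entry.isEmpty()) { // pretend empty entries are rejected
                failed.add(entry);
            }
        }
        return failed;
    }
}

// Wraps the blocking client so the caller's thread is never blocked.
final class AsyncAdapterSketch {
    private final BlockingClientSketch client = new BlockingClientSketch();
    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    // Returns immediately; the blocking call runs on the pool and the failed
    // entries are handed to requestResult once it finishes.
    CompletableFuture<Void> submitRequestEntries(
            List<String> requestEntries, Consumer<List<String>> requestResult) {
        return CompletableFuture
                .supplyAsync(() -> client.putRecords(requestEntries), pool)
                .thenAccept(requestResult);
    }

    void close() {
        pool.shutdown();
    }
}
```

The future is returned only so a caller or test can wait for completion; the important part is that the thread calling `submitRequestEntries` hands the work off and returns.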
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.