C0urante commented on a change in pull request #11773:
URL: https://github.com/apache/kafka/pull/11773#discussion_r824130593
##########
File path:
connect/api/src/main/java/org/apache/kafka/connect/source/SourceConnector.java
##########
@@ -28,4 +30,46 @@
protected SourceConnectorContext context() {
return (SourceConnectorContext) context;
}
+
+ /**
+ * Signals whether the connector supports exactly-once delivery guarantees
with a proposed configuration.
+ * Developers can assume that worker-level exactly-once support is enabled
when this method is invoked.
+ *
+ * <p>For backwards compatibility, the default implementation will return
{@code null}, but connector developers are
+ * strongly encouraged to override this method to return a non-null value
such as
+ * {@link ExactlyOnceSupport#SUPPORTED SUPPORTED} or {@link
ExactlyOnceSupport#UNSUPPORTED UNSUPPORTED}.
+ *
+ * <p>Similar to {@link #validate(Map) validate}, this method may be
called by the runtime before the
+ * {@link #start(Map) start} method is invoked when the connector will be
run with exactly-once support.
Review comment:
Oooh, that's an interesting idea.
I think your point about the connector doing extra work is my biggest
concern on that front, and it seems to be validated (heh) by how `validate` is
designed (we never actually invoke `validate` on the same `Connector` instances
that we invoke `start` on at the moment). It may not be very frequent, but
there certainly are some connectors out there that do some heavy lifting in
`start` that we wouldn't want to do every time during preflight validation. For
a practical example, see MirrorMaker2:
https://github.com/apache/kafka/blob/38e3787d760fba6bdac4c91126e87cd7939ae43f/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorSourceConnector.java#L120-L128
Adding a `configure` method would help with this issue by allowing us to
give a long-lived config to a `Connector` that the connector can hold onto
across successive invocations of `exactlyOnceSupport` and
`canDefineTransactionBoundaries` (and possibly even `validate`). But, it would
also complicate some of the existing preflight validation logic in the
framework. Right now, we only create a single `Connector` instance per
connector type per worker for preflight validation, and all invocations of
`validate` and `config` are performed with that instance (see `AbstractHerder`
[here](https://github.com/apache/kafka/blob/38e3787d760fba6bdac4c91126e87cd7939ae43f/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/AbstractHerder.java#L431),
[here](https://github.com/apache/kafka/blob/38e3787d760fba6bdac4c91126e87cd7939ae43f/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/AbstractHerder.java#L664-L666),
and [here](https://github.com/apache/kafka/blob/38e3787d760
fba6bdac4c91126e87cd7939ae43f/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/AbstractHerder.java#L461-L479)).
Since we use these validation-only `Connector` instances pretty loosely and
without any guarantees about order of invocations for `config` and `validate`,
it's unlikely that there are connectors out there that store state in either of
these methods, so there could be relatively low risk for creating a new
validation-only `Connector` instance every time a config has to be validated.
But I have to wonder if we're overthinking things? It seems like we're
trying to optimize away from methods that accept a raw config map and instead
only provide that config once per round of validation. Is the objective there
to create a smoother/better-documented/more-strictly-defined API? Improve
resource utilization by obviating the need of connector developers to construct
`AbstractConfig` subclasses (or equivalents) which parse and validate
individual properties at instantiation time?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]