[ https://issues.apache.org/jira/browse/FLINK-10900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16689580#comment-16689580 ]
ASF GitHub Bot commented on FLINK-10900: ---------------------------------------- pnowojski closed pull request #7109: [FLINK-10900][kafka][docs] Mark universal Kafka connector as beta URL: https://github.com/apache/flink/pull/7109 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: diff --git a/docs/dev/connectors/kafka.md b/docs/dev/connectors/kafka.md index f81080a4b53..0630c6ec7d6 100644 --- a/docs/dev/connectors/kafka.md +++ b/docs/dev/connectors/kafka.md @@ -73,7 +73,7 @@ For most users, the `FlinkKafkaConsumer08` (part of `flink-connector-kafka`) is <td>This connector supports <a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-32+-+Add+timestamps+to+Kafka+message">Kafka messages with timestamps</a> both for producing and consuming.</td> </tr> <tr> - <td>flink-connector-kafka-0.11_2.11</td> + <td>flink-connector-kafka-0.11{{ site.scala_version_suffix }}</td> <td>1.4.0</td> <td>FlinkKafkaConsumer011<br> FlinkKafkaProducer011</td> @@ -81,13 +81,25 @@ For most users, the `FlinkKafkaConsumer08` (part of `flink-connector-kafka`) is <td>Since 0.11.x Kafka does not support scala 2.10. This connector supports <a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging">Kafka transactional messaging</a> to provide exactly once semantic for the producer.</td> </tr> <tr> - <td>flink-connector-kafka_2.11</td> + <td>flink-connector-kafka{{ site.scala_version_suffix }}</td> <td>1.7.0</td> <td>FlinkKafkaConsumer<br> FlinkKafkaProducer</td> <td>>= 1.0.0</td> - <td>This Kafka connector attempts to track the latest version of the Kafka client. The version of the client it uses may change between Flink releases. Modern Kafka clients are backwards compatible with broker versions 0.10.0 or later. 
However for Kafka 0.11.x and 0.10.x versions, we recommend using dedicated flink-connector-kafka-0.11 and link-connector-kafka-0.10 respectively.</td> - </tr> + <td> + This universal Kafka connector attempts to track the latest version of the Kafka client. + The version of the client it uses may change between Flink releases. + Modern Kafka clients are backwards compatible with broker versions 0.10.0 or later. + However, for Kafka 0.11.x and 0.10.x versions, we recommend using the dedicated + flink-connector-kafka-0.11{{ site.scala_version_suffix }} and flink-connector-kafka-0.10{{ site.scala_version_suffix }} respectively. + <div class="alert alert-warning"> + <strong>Attention:</strong> As of Flink 1.7, the universal Kafka connector is considered to be + in a <strong>BETA</strong> status and might not be as stable as the 0.11 connector. + In case of problems with the universal connector, you can try to use flink-connector-kafka-0.11{{ site.scala_version_suffix }}, + which should be compatible with all Kafka versions starting from 0.11. + </div> + </td> + </tr> </tbody> </table> @@ -101,7 +113,8 @@ Then, import the connector in your maven project: </dependency> {% endhighlight %} -Note that the streaming connectors are currently not part of the binary distribution. See how to link with them for cluster execution [here]({{ site.baseurl}}/dev/linking.html). +Note that the streaming connectors are currently not part of the binary distribution. +See how to link with them for cluster execution [here]({{ site.baseurl}}/dev/linking.html). ## Installing Apache Kafka @@ -110,17 +123,21 @@ Note that the streaming connectors are currently not part of the binary distribu ## Kafka 1.0.0+ Connector -Starting with Flink 1.7, there is a new Kafka connector that does not track a specific Kafka major version. Rather, it tracks the latest version of Kafka at the time of the Flink release. 
+Starting with Flink 1.7, there is a new universal Kafka connector that does not track a specific Kafka major version. +Rather, it tracks the latest version of Kafka at the time of the Flink release. -If your Kafka broker version is 1.0.0 or newer, you should use this Kafka connector. If you use an older version of Kafka (0.11, 0.10, 0.9, or 0.8), you should use the connector corresponding to the broker version. +If your Kafka broker version is 1.0.0 or newer, you should use this Kafka connector. +If you use an older version of Kafka (0.11, 0.10, 0.9, or 0.8), you should use the connector corresponding to the broker version. ### Compatibility -The modern Kafka connector is compatible with older and newer Kafka brokers through the compatibility guarantees of the Kafka client API and broker. The modern Kafka connector is compatible with broker versions 0.11.0 or later, depending on the features used. For details on Kafka compatibility, please refer to the [Kafka documentation](https://kafka.apache.org/protocol.html#protocol_compatibility). +The universal Kafka connector is compatible with older and newer Kafka brokers through the compatibility guarantees of the Kafka client API and broker. +It is compatible with broker versions 0.11.0 or newer, depending on the features used. +For details on Kafka compatibility, please refer to the [Kafka documentation](https://kafka.apache.org/protocol.html#protocol_compatibility). ### Usage -The use of the modern Kafka connector add a dependency to it: +To use the universal Kafka connector, add a dependency on it: {% highlight xml %} <dependency> @@ -130,11 +147,14 @@ The use of the modern Kafka connector add a dependency to it: </dependency> {% endhighlight %} -Then instantiate the new source (`FlinkKafkaConsumer`) and sink (`FlinkKafkaProducer`). The API is the backwards compatible with the older Kafka connectors. +Then instantiate the new source (`FlinkKafkaConsumer`) and sink (`FlinkKafkaProducer`). 
+The API is backward compatible with the Kafka 0.11 connector, +except that the specific Kafka version has been dropped from the module and class names. ## Kafka Consumer -Flink's Kafka consumer is called `FlinkKafkaConsumer08` (or 09 for Kafka 0.9.0.x versions, etc. or just `FlinkKafkaConsumer` for Kafka >= 1.0.0 versions). It provides access to one or more Kafka topics. +Flink's Kafka consumer is called `FlinkKafkaConsumer08` (or 09 for Kafka 0.9.0.x versions, etc., +or just `FlinkKafkaConsumer` for Kafka >= 1.0.0 versions). It provides access to one or more Kafka topics. The constructor accepts the following arguments: @@ -516,7 +536,10 @@ the `Watermark getCurrentWatermark()` (for periodic) or the `Watermark checkAndGetNextWatermark(T lastElement, long extractedTimestamp)` (for punctuated) is called to determine if a new watermark should be emitted and with which timestamp. -**Note**: If a watermark assigner depends on records read from Kafka to advance its watermarks (which is commonly the case), all topics and partitions need to have a continuous stream of records. Otherwise, the watermarks of the whole application cannot advance and all time-based operations, such as time windows or functions with timers, cannot make progress. A single idle Kafka partition causes this behavior. +**Note**: If a watermark assigner depends on records read from Kafka to advance its watermarks +(which is commonly the case), all topics and partitions need to have a continuous stream of records. +Otherwise, the watermarks of the whole application cannot advance and all time-based operations, +such as time windows or functions with timers, cannot make progress. A single idle Kafka partition causes this behavior. A Flink improvement is planned to prevent this from happening (see [FLINK-5479: Per-partition watermarks in FlinkKafkaConsumer should consider idle partitions]( https://issues.apache.org/jira/browse/FLINK-5479)). 
@@ -729,7 +752,8 @@ application before first checkpoint completes, by factor larger then `FlinkKafka ## Using Kafka timestamps and Flink event time in Kafka 0.10 -Since Apache Kafka 0.10+, Kafka's messages can carry [timestamps](https://cwiki.apache.org/confluence/display/KAFKA/KIP-32+-+Add+timestamps+to+Kafka+message), indicating +Since Apache Kafka 0.10+, Kafka's messages can carry +[timestamps](https://cwiki.apache.org/confluence/display/KAFKA/KIP-32+-+Add+timestamps+to+Kafka+message), indicating the time the event has occurred (see ["event time" in Apache Flink](../event_time.html)) or the time when the message has been written to the Kafka broker. @@ -789,17 +813,20 @@ Flink provides first-class support through the Kafka connector to authenticate t configured for Kerberos. Simply configure Flink in `flink-conf.yaml` to enable Kerberos authentication for Kafka like so: 1. Configure Kerberos credentials by setting the following - - - `security.kerberos.login.use-ticket-cache`: By default, this is `true` and Flink will attempt to use Kerberos credentials in ticket caches managed by `kinit`. - Note that when using the Kafka connector in Flink jobs deployed on YARN, Kerberos authorization using ticket caches will not work. This is also the case when deploying using Mesos, as authorization using ticket cache is not supported for Mesos deployments. + - `security.kerberos.login.use-ticket-cache`: By default, this is `true` and Flink will attempt to use Kerberos credentials in ticket caches managed by `kinit`. + Note that when using the Kafka connector in Flink jobs deployed on YARN, Kerberos authorization using ticket caches will not work. + This is also the case when deploying using Mesos, as authorization using ticket cache is not supported for Mesos deployments. - `security.kerberos.login.keytab` and `security.kerberos.login.principal`: To use Kerberos keytabs instead, set values for both of these properties. 2. 
Append `KafkaClient` to `security.kerberos.login.contexts`: This tells Flink to provide the configured Kerberos credentials to the Kafka login context to be used for Kafka authentication. -Once Kerberos-based Flink security is enabled, you can authenticate to Kafka with either the Flink Kafka Consumer or Producer by simply including the following two settings in the provided properties configuration that is passed to the internal Kafka client: +Once Kerberos-based Flink security is enabled, you can authenticate to Kafka with either the Flink Kafka Consumer or Producer +by simply including the following two settings in the provided properties configuration that is passed to the internal Kafka client: - Set `security.protocol` to `SASL_PLAINTEXT` (default `NONE`): The protocol used to communicate to Kafka brokers. When using standalone Flink deployment, you can also use `SASL_SSL`; please see how to configure the Kafka client for SSL [here](https://kafka.apache.org/documentation/#security_configclients). -- Set `sasl.kerberos.service.name` to `kafka` (default `kafka`): The value for this should match the `sasl.kerberos.service.name` used for Kafka broker configurations. A mismatch in service name between client and server configuration will cause the authentication to fail. +- Set `sasl.kerberos.service.name` to `kafka` (default `kafka`): The value for this should match the `sasl.kerberos.service.name` used for Kafka broker configurations. +A mismatch in service name between client and server configuration will cause the authentication to fail. For more information on Flink configuration for Kerberos security, please see [here]({{ site.baseurl}}/ops/config.html). You can also find [here]({{ site.baseurl}}/ops/security-kerberos.html) further details on how Flink internally setups Kerberos-based security. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. 
To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Mark Kafka 2.0 connector as beta feature > ---------------------------------------- > > Key: FLINK-10900 > URL: https://issues.apache.org/jira/browse/FLINK-10900 > Project: Flink > Issue Type: Task > Components: Kafka Connector > Affects Versions: 1.7.0 > Reporter: Till Rohrmann > Assignee: Piotr Nowojski > Priority: Blocker > Labels: pull-request-available > Fix For: 1.7.0 > > > Given the test problems with the Kafka 2.0 connector we should mark this > connector as a beta feature until we have fully understood why so many tests > deadlock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
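For illustration, the two Kerberos-related client settings that the diff above documents (`security.protocol` and `sasl.kerberos.service.name`) can be sketched with plain `java.util.Properties`, the same properties object that is passed to the Flink Kafka Consumer or Producer. This is a minimal sketch, not part of the PR; the helper class and method names are hypothetical, while the property keys and values are the ones named in the documentation.

```java
import java.util.Properties;

public class KafkaKerberosProps {

    // Build the properties the documentation says must be passed to the
    // internal Kafka client for Kerberos authentication.
    static Properties kerberosClientProperties(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Protocol used to communicate with Kafka brokers; for standalone
        // Flink deployments, SASL_SSL is also possible.
        props.setProperty("security.protocol", "SASL_PLAINTEXT");
        // Must match sasl.kerberos.service.name in the broker configuration,
        // otherwise authentication fails.
        props.setProperty("sasl.kerberos.service.name", "kafka");
        return props;
    }

    public static void main(String[] args) {
        Properties props = kerberosClientProperties("localhost:9092");
        System.out.println(props.getProperty("security.protocol"));
    }
}
```

The same `Properties` instance would then be handed to `FlinkKafkaConsumer`/`FlinkKafkaProducer`, alongside the `security.kerberos.login.*` options in `flink-conf.yaml` described above.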