We should focus on the main reason for removing Kafka 0.9 support. My impression is that this is mostly to ease maintenance, but from the current status (and the removal PR [1]) it does not seem to be a burden to continue supporting 0.9. In any case I am +1 to removing support for 0.9, but maybe it is a good idea to wait until the next LTS is decided and do it just after. This way we will still cover existing users for some time.
Creating different modules for different versions of KafkaIO does not make sense: it would be even more complicated than what we have today, for not much in return. We would do better to improve the status quo by parametrizing our current tests to validate that KafkaIO works correctly with all the supported versions (so far we only test against version 1.0.0). I filed BEAM-7003 [2] to track this.

[1] https://github.com/apache/beam/pull/8186
[2] https://issues.apache.org/jira/browse/BEAM-7003

ps. This discussion also brings to the table the issue of removing/deprecating/changing supported versions in parts of the API marked as @Experimental. I will fork a new thread to discuss this.

On Wed, Apr 3, 2019 at 6:53 PM Raghu Angadi <ang...@gmail.com> wrote:
>
> On Wed, Apr 3, 2019 at 5:46 AM David Morávek <david.mora...@gmail.com> wrote:
>>
>> I'd say that the APIs we use in KafkaIO have been pretty much stable since the 0.10 release; all the reflection-based compatibility adapters seem to be aimed at the 0.9 release (which is 8 major releases behind the current Kafka release).
>>
>> We may take inspiration from Flink's Kafka connector: they maintain a separate Maven artifact for each supported Kafka API. This may be the best approach, as we can still share most of the codebase between versions, have compile-time checks, and also run tests against all of the supported versions.
>
> From that page, Flink also moved to a single Kafka connector for versions 1.0.x and newer. Kafka itself seems to have improved compatibility between client and broker versions starting with 0.11. Not sure there is any need now to maintain multiple versions of KafkaIO for 0.9.x etc. Are you suggesting we should?
>
> From Flink's page:
> "Starting with Flink 1.7, there is a new universal Kafka connector that does not track a specific Kafka major version. Rather, it tracks the latest version of Kafka at the time of the Flink release.
>
> If your Kafka broker version is 1.0.0 or newer, you should use this Kafka connector. If you use an older version of Kafka (0.11, 0.10, 0.9, or 0.8), you should use the connector corresponding to the broker version."
>
>> I'm not really comfortable with reflection-based adapters, as they seem fragile and don't provide compile-time checks.
>>
>> On Tue, Apr 2, 2019 at 11:27 PM Austin Bennett <whatwouldausti...@gmail.com> wrote:
>>>
>>> I withdraw my concern -- I checked on the cluster I will eventually access. It is on 0.8, so I was speaking too soon. Can't speak for the rest of the user base.
>>>
>>> On Tue, Apr 2, 2019 at 11:03 AM Raghu Angadi <ang...@gmail.com> wrote:
>>>>
>>>> Thanks to David Morávek for pointing out a possible improvement to KafkaIO from dropping support for 0.9: it avoids having a second consumer just to fetch the latest offsets for the backlog.
>>>>
>>>> Ideally we should drop 0.9 support in the next major release; in fact it would be better to drop all versions before 0.10.1 at the same time. This would further reduce the reflection-based calls needed to support multiple versions. If the users still on 0.9 could stay on the current stable release of Beam, dropping it would not affect them. Otherwise, it would be good to hear from them about how long we need to keep support for the old versions.
>>>>
>>>> I don't think it is a good idea to have multiple forks of KafkaIO in the same repo. If we do go that route, we should fork the entire kafka directory and rename the main class KafkaIO_Unmaintained :).
>>>>
>>>> IMHO, so far, the additional complexity of supporting these versions is not that bad. Most of it is isolated to ConsumerSpEL.java & ProducerSpEL.java. My first preference is dropping support for the deprecated versions (and deprecating a few more, maybe up to the version that added transactions, around 0.11.x I think).
>>>>
>>>> I haven't looked into what's new in Kafka 2.x.
>>>> Are there any features that KafkaIO should take advantage of? I have not noticed our existing code breaking. We should certainly support the latest releases of Kafka.
>>>>
>>>> Raghu.
>>>>
>>>> On Tue, Apr 2, 2019 at 10:27 AM Mingmin Xu <mingm...@gmail.com> wrote:
>>>>>
>>>>> We're still using Kafka 0.10 a lot, similar to 0.9 IMO. Supporting multiple versions in KafkaIO is already quite complex, and it confuses users as to which versions are supported and which are not. I would prefer to support only Kafka 2.0+ in the latest version. For old versions, there are some options:
>>>>> 1) document the Kafka-Beam supported versions, like what we do in FlinkRunner;
>>>>> 2) maintain separate KafkaIOs for old versions.
>>>>>
>>>>> 1) would be easy to maintain, and I assume there should be no issue using Beam-Core 3.0 together with KafkaIO 2.0.
>>>>>
>>>>> Any thoughts?
>>>>>
>>>>> Mingmin
>>>>>
>>>>> On Tue, Apr 2, 2019 at 9:56 AM Reuven Lax <re...@google.com> wrote:
>>>>>>
>>>>>> KafkaIO is marked as Experimental, and the comment already warns that 0.9 support might be removed. I think that if users still rely on Kafka 0.9 we should leave a fork (renamed) of the IO in the tree for 0.9, but we can definitely remove 0.9 support from the main IO if we want, especially if supporting it complicates changes to that IO. If we do though, we should fail with a clear error message telling users to use the Kafka 0.9 IO.
>>>>>>
>>>>>> On Tue, Apr 2, 2019 at 9:34 AM Alexey Romanenko <aromanenko....@gmail.com> wrote:
>>>>>>>
>>>>>>> > How are multiple versions of Kafka supported? Are they all in one client, or is there a case for forks like ElasticSearchIO?
>>>>>>>
>>>>>>> They are supported in one client, but we have an additional “ConsumerSpEL” adapter which unifies the interface differences among Kafka client versions (mostly to support the old ones, 0.9-0.10.0).
>>>>>>>
>>>>>>> On the other hand, we warn the user in the Javadoc of KafkaIO (which is Unstable, btw) with the following: “KafkaIO relies on kafka-clients for all its interactions with the Kafka cluster. kafka-clients versions 0.10.1 and newer are supported at runtime. The older versions 0.9.x - 0.10.0.0 are also supported, but are deprecated and likely to be removed in the near future.”
>>>>>>>
>>>>>>> Personally, I’d prefer to have only one unified client interface, but since people still use Beam with old Kafka instances, we likely should stick with it till Beam 3.0.
>>>>>>>
>>>>>>> WDYT?
>>>>>>>
>>>>>>> On 2 Apr 2019, at 02:27, Austin Bennett <whatwouldausti...@gmail.com> wrote:
>>>>>>>
>>>>>>> FWIW --
>>>>>>>
>>>>>>> On my (desired, not explicitly job-function) roadmap is to tap into a bunch of our corporate Kafka queues to ingest that data to places I can use. Those are 'stuck' on 0.9, with no upgrade in sight (I am told the upgrade path isn't trivial, these are very critical flows, and they are scared of breaking them, so it all just sits behind firewalls, etc.). But I wouldn't begin that for probably at least another quarter.
>>>>>>>
>>>>>>> I don't contribute to, nor understand, the burden of maintaining support for the older version, so I can't reasonably lobby for that continued pain.
>>>>>>>
>>>>>>> Anecdotally, this could be where many enterprises are (though I also wonder whether many of the people 'stuck' on such versions would have Beam on their current radar).
>>>>>>>
>>>>>>> On Mon, Apr 1, 2019 at 2:29 PM Kenneth Knowles <k...@apache.org> wrote:
>>>>>>>>
>>>>>>>> This could be a backward-incompatible change, though that notion has many interpretations. What matters is user pain.
>>>>>>>> Technically, if we don't break the core SDK, users should be able to use Java SDK >=2.11.0 with KafkaIO 2.11.0 forever.
>>>>>>>>
>>>>>>>> How are multiple versions of Kafka supported? Are they all in one client, or is there a case for forks like ElasticSearchIO?
>>>>>>>>
>>>>>>>> Kenn
>>>>>>>>
>>>>>>>> On Mon, Apr 1, 2019 at 10:37 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>>>>>>>>>
>>>>>>>>> +1 to remove 0.9 support.
>>>>>>>>>
>>>>>>>>> I think it's more interesting to test and verify Kafka 2.2.0 than 0.9 ;)
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> JB
>>>>>>>>>
>>>>>>>>> On 01/04/2019 19:36, David Morávek wrote:
>>>>>>>>> > Hello,
>>>>>>>>> >
>>>>>>>>> > is there still a reason to keep Kafka 0.9 support? This unfortunately adds a lot of complexity to the KafkaIO implementation.
>>>>>>>>> >
>>>>>>>>> > Kafka 0.9 was released in Nov 2015.
>>>>>>>>> >
>>>>>>>>> > My first shot at removing Kafka 0.9 support would remove the second consumer, which is used for fetching offsets.
>>>>>>>>> >
>>>>>>>>> > WDYT? Is this support worth keeping?
>>>>>>>>> >
>>>>>>>>> > https://github.com/apache/beam/pull/8186
>>>>>>>>> >
>>>>>>>>> > D.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Jean-Baptiste Onofré
>>>>>>>>> jbono...@apache.org
>>>>>>>>> http://blog.nanthrax.net
>>>>>>>>> Talend - http://www.talend.com
>>>>>
>>>>> --
>>>>> ----
>>>>> Mingmin
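For readers unfamiliar with the pattern the thread keeps referring to, the reflection-based adaptation that ConsumerSpEL performs can be sketched roughly as follows. This is a minimal, self-contained illustration with made-up class names, not the actual Beam or kafka-clients code; the real adapter reflects on the kafka-clients Consumer API (endOffsets was added to the consumer in 0.10.1, which is why older clients force KafkaIO to keep a second consumer just to read backlog offsets).

```java
import java.lang.reflect.Method;

// Sketch of a ConsumerSpEL-style adapter: probe at runtime which API the
// client jar on the classpath provides, and fall back for older clients.
// OldConsumer/NewConsumer are hypothetical stand-ins for pre-0.10.1 and
// 0.10.1+ kafka-clients consumers.
public class ConsumerAdapterSketch {

    // Stand-in for a pre-0.10.1 consumer: no endOffsets() method.
    public static class OldConsumer {
        public long position() { return 42L; }
    }

    // Stand-in for a 0.10.1+ consumer: endOffsets() lets the IO read the
    // latest offsets directly, with no second consumer needed.
    public static class NewConsumer extends OldConsumer {
        public long endOffsets() { return 100L; }
    }

    // Reflection probe: does this client expose endOffsets()?
    public static boolean hasEndOffsets(Object consumer) {
        try {
            consumer.getClass().getMethod("endOffsets");
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Dispatch on the probe; -1 marks the old-client path, where the real
    // code would fall back to a second consumer to fetch offsets.
    public static long latestOffset(Object consumer) {
        try {
            Method m = consumer.getClass().getMethod("endOffsets");
            return (Long) m.invoke(consumer);
        } catch (ReflectiveOperationException e) {
            return -1L; // old client: no endOffsets() available
        }
    }

    public static void main(String[] args) {
        System.out.println(latestOffset(new NewConsumer())); // 100
        System.out.println(latestOffset(new OldConsumer())); // -1
    }
}
```

The fragility David mentions is visible even in this toy: the probe is a string lookup with no compile-time check, which is exactly what dropping pre-0.10.1 support would eliminate.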