I agree with Jungtaek. The same situation has come up again with RocketMQ
(https://github.com/apache/storm/pull/2518). Here is my suggestion.

1. Storm now has too many connectors, so we can separate the first-class
   connectors from the others. The following is a possible split covering
   all existing connectors.

   First class:
   - Kafka
   - HDFS
   - HBase
   - Hive
   - Redis
   - JDBC
   - JMS

   Others:
   - Solr
   - Cassandra
   - Elasticsearch
   - Event Hubs
   - RocketMQ
   - MongoDB
   - OpenTSDB
   - Kinesis
   - Druid
   - MQTT
   - PMML

2. For the first-class connectors we can leave the code where it is but
   release them independently. For the other connectors, I would prefer to
   move them to Bahir, as Spark and Flink have done. We can talk to the
   Bahir community and ask them to create a
   https://github.com/apache/bahir-storm.git repo.

2018-02-01 9:10 GMT+08:00 P. Taylor Goetz <ptgo...@gmail.com>:

> I’d start with Storm-Kafka-client as an experiment, and if that goes
> well, move all connectors to the same model.
>
> Some connectors are bound to a stable protocol (e.g. JMS, MQTT), some
> are bound to frequently changing APIs (e.g. Apache Kafka, cassandra, ES,
> etc.). The former tend to be stable in terms of usage patterns and use
> cases, the latter not so much. For example, consider hdfs integration.
> It’s changed a lot in response to different usage patterns. Kafka due to
> new/changing APIs. JMS hasn’t changed much at all since it’s tied to a
> stable API.
>
> There’s also the fact that a high percentage of connectors integrate
> with the most stable Storm APIs (spout, bolt, trident). The volatile
> (using the term loosely) parts of our API affect projects like Mesos and
> streamparse, but not the connectors we sponsor.
>
> -Taylor
>
> > On Jan 31, 2018, at 7:07 PM, Roshan Naik <ros...@hortonworks.com> wrote:
> >
> > I was thinking if any connector is released more frequently, their
> > quality would be more mature and typically have lower impact on a
> > Storm release (compared to now) … if we decide to bundle them in Storm
> > as well.
> > -roshan
> >
> >
> > On 1/31/18, 4:02 PM, "P. Taylor Goetz" <ptgo...@gmail.com> wrote:
> >
> > I think we all agree that releasing connectors as part of a Storm
> > release hinders the frequency of the release cycle for both Storm
> > proper, as well as connectors.
> >
> > If that’s the case, then the question is how to proceed.
> >
> > -Taylor
> >
> >> On Jan 31, 2018, at 6:46 PM, Roshan Naik <ros...@hortonworks.com> wrote:
> >>
> >> One thought is to …
> >> - do a frequent separate release
> >> - *and also* include the latest stuff along with each Storm release.
> >>
> >> -roshan
> >>
> >>
> >> On 1/31/18, 10:43 AM, "generalbas....@gmail.com on behalf of Stig Rohde Døssing" <generalbas....@gmail.com on behalf of stigdoess...@gmail.com> wrote:
> >>
> >> Hugo,
> >> It's not my impression that anyone is complaining that
> >> storm-kafka-client has been exceptionally buggy, or that we haven't
> >> been fixing the issues as they crop up. The problem is that we're
> >> sitting on the fixes for way longer than is reasonable, and even if
> >> we release Storm more often, users have to go out of their way to
> >> know that they should really be using the latest storm-kafka-client
> >> rather than the one that ships with their Storm installation, because
> >> the version number of storm-kafka-client happens to not mean anything
> >> regarding compatibility with Storm.
> >>
> >> Everyone,
> >>
> >> Most of what I've written here has already been said, but I've
> >> already written it so...
> >>
> >> I really don't see the point in going through the effort of
> >> separating connectors out to another repository if we're just going
> >> to make the other repository the second class citizen connector
> >> graveyard.
> >>
> >> The point to separating storm-kafka-client out is so it can get a
> >> release cycle different from Storm, so we can avoid the situation
> >> we're in now in the future.
> >> There's obviously a flaw in our process when we have to choose
> >> between breaking semantic versioning and releasing broken software.
> >>
> >> I agree that it would be good to release Storm a little more often,
> >> but I don't think that fully addresses my concerns. Are we willing to
> >> increment Storm's major version number if a connector needs to break
> >> its API (e.g. as I want to do in
> >> https://github.com/apache/storm/pull/2300)?
> >>
> >> I think a key observation is that Storm's core API is extremely
> >> stable. Storm and the connectors aren't usually tightly coupled in
> >> the sense that e.g. version 1.0.2 of storm-kafka-client would only
> >> work with Storm 1.0.2 and not 1.0.0, so in many cases there's no
> >> reason you wouldn't use the latest connector version instead of the
> >> one that happens to ship with the version of Storm you're using. I
> >> think it would be attractive if we could reduce the number of
> >> branches of connectors we need to maintain, and instead keep a
> >> compatibility matrix between Storm and the connector in each README,
> >> for the rare occasions when the Storm core API changes.
> >>
> >> +1 for trying out storm-kafka-client with its own release cycle and
> >> branches/subrepo/whichever way we want to separate the code, but
> >> still part of the main Storm project JIRA and mailing list. Worst
> >> case we merge it back in after a while. We may want to think about
> >> how to do that before we separate out, just so we don't release e.g.
> >> storm-kafka-client 2.3.1 and then have to merge back to Storm which
> >> is still on 2.0.0.
> >>
> >> 2018-01-31 3:36 GMT+01:00 Jungtaek Lim <kabh...@gmail.com>:
> >>
> >>> Agreed for this topic: this is not related to current release
> >>> candidate and verifying release candidate is higher priority.
> >>> For me I didn't start verifying 1.1.2 / 1.0.6 RC2 because the other
> >>> topic I initiated could affect the current release.
> >>> I'll post a short notice in that discussion thread.
> >>>
> >>> -Jungtaek Lim (HeartSaVioR)
> >>>
> >>> On Wed, Jan 31, 2018 at 10:58 AM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
> >>>
> >>>> Hit send on that too soon...
> >>>>
> >>>> This is an important discussion topic, but has no effect on the
> >>>> current RCs. I’d recommend focusing on the current releases and
> >>>> coming back to this after getting releases out.
> >>>>
> >>>> -Taylor
> >>>>
> >>>>> On Jan 30, 2018, at 8:51 PM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
> >>>>>
> >>>>> Also, in the interest of getting releases out, we have 3 open RC
> >>>>> cycles in flight.
> >>>>>
> >>>>> Discussion energy might be better focused on that.
> >>>>>
> >>>>> -Taylor
> >>>>>
> >>>>>> On Jan 30, 2018, at 7:52 PM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> On Jan 30, 2018, at 7:31 PM, Harsha <st...@harsha.io> wrote:
> >>>>>>>
> >>>>>>> Hi,
> >>>>>>> In general, connectors are independent of the Storm run-time
> >>>>>>> for the most part. I.e. if the APIs are not changed (storm-core
> >>>>>>> or trident haven't changed in years except for the package
> >>>>>>> rename), you can take the latest connector and run it on Storm
> >>>>>>> 1.0 or higher. So users don't need to upgrade their Storm
> >>>>>>> cluster just to get the latest connector upgrade, which they
> >>>>>>> might be doing today. Making the release separate and stating
> >>>>>>> the minimum supported Storm version for the connectors will
> >>>>>>> help the users.
> >>>>>>> This makes it easier for the connectors to be released
> >>>>>>> independently of the core/run-time and makes it easy for them
> >>>>>>> to be fixed and released more often. But moving them to Bahir
> >>>>>>> or another external project will detach them from Storm itself,
> >>>>>>> so they might not see any co-ordination, as reviewers from
> >>>>>>> Storm will need to be aware of an external project.
> >>>>>>> My proposal would be
> >>>>>>> 1. Can we create a sub-project in git under Storm so we can
> >>>>>>> move the connectors there, and everything else related to Storm
> >>>>>>> applies there.
> >>>>>>> 2. Can we keep maintaining Storm connectors within the same
> >>>>>>> repo but with a different release module for them.
> >>>>>>
> >>>>>> +1 That’s exactly my point. Just jettisoning connectors to Bahir
> >>>>>> without commitments from the Storm community would be a mistake.
> >>>>>>
> >>>>>> Releasing connectors independently can be handled easily at the
> >>>>>> Maven level. No need for a separate repo initially.
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> This is a separate topic, but we can improve the release
> >>>>>>> timelines if we have multiple release managers handling the
> >>>>>>> maintenance releases as well as the main release versions. It's
> >>>>>>> good to have a rotation of release managers from the PMC so
> >>>>>>> that everyone understands the process and we can spread the
> >>>>>>> responsibilities. There were threads started before, but I
> >>>>>>> don't think they were addressed or that any action item was
> >>>>>>> taken. We should start another thread to discuss this process
> >>>>>>> as well.
> >>>>>>
> >>>>>> Breaking up external modules into separately released versions
> >>>>>> would be a great way to indoctrinate those new to the license
> >>>>>> grooming and release process. Everyone could participate.
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Harsha
> >>>>>>
> >>>>>> -Taylor
> >>>>>>
> >>>>>>>
> >>>>>>>> On Tue, Jan 30, 2018, at 9:49 AM, Hugo Da Cruz Louro wrote:
> >>>>>>>> I think that the bahir approach makes sense for connectors
> >>>>>>>> that don’t fall into the “first class support” category. I am
> >>>>>>>> in favor of moving such lower adoption connectors and having
> >>>>>>>> the interested communities support them with the most suitable
> >>>>>>>> release cycle.
> >>>>>>>> Connectors that are idle, such as some examples that Jungtaek
> >>>>>>>> gave, we should consider removing altogether, especially if
> >>>>>>>> they are so outdated that they may not even work.
> >>>>>>>>
> >>>>>>>> Mainstream connectors such as storm-kafka-client should be
> >>>>>>>> kept in the Storm repo. For example, Flink keeps
> >>>>>>>> flink-connector-kafka-0.x in the Flink repo.
> >>>>>>>>
> >>>>>>>> I am in agreement with Jungtaek when he says: “fixing critical
> >>>>>>>> bugs in storm-kafka-client should trigger release, instead of
> >>>>>>>> waiting for Storm core to have some fixes to be worth to
> >>>>>>>> release”. Storm’s release cadence is currently not very high
> >>>>>>>> and one can argue that Storm as a whole could benefit from
> >>>>>>>> more frequent releases. If it is storm-kafka-client triggering
> >>>>>>>> those releases, so be it. Moving forward I do not expect the
> >>>>>>>> storm-kafka-client connector to be subject to so many changes
> >>>>>>>> that it would warrant its own release cycle.
> >>>>>>>>
> >>>>>>>> I also would like to highlight that although
> >>>>>>>> storm-kafka-client has been the center of this discussion, as
> >>>>>>>> was mentioned in this thread <https://goo.gl/VY7QTG>,
> >>>>>>>> storm-kafka-client has had a much less rocky road to stability
> >>>>>>>> compared to, for example, storm-kafka. Therefore it’s worth
> >>>>>>>> evaluating if the challenges that we have faced with
> >>>>>>>> storm-kafka-client have been out of the norm for such an
> >>>>>>>> important and complex feature, and if they warrant significant
> >>>>>>>> changes in how we do things.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Hugo
> >>>>>>>>
> >>>>>>>> On Jan 29, 2018, at 9:18 PM, Jungtaek Lim <kabh...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Let me add some evidence for my opinion: a major patch for
> >>>>>>>> storm-eventhubs has not received even a single comment in over
> >>>>>>>> 4 months.
> >>>>>>>> https://github.com/apache/storm/pull/2322
> >>>>>>>>
> >>>>>>>> I'd rather discuss officially discontinuing support for
> >>>>>>>> connectors that we are no longer interested in, that we don't
> >>>>>>>> have the resources to support, or that we have any other valid
> >>>>>>>> reason to drop. If we agree to discontinue official support,
> >>>>>>>> we can move a connector out to another repo and let it be
> >>>>>>>> self-maintained. It may get attention and attract enough
> >>>>>>>> contributors that we feel comfortable bringing it back into
> >>>>>>>> the Storm core repository, or it may be silently forgotten.
> >>>>>>>> Either way, it shouldn't affect the Storm core repository.
> >>>>>>>>
> >>>>>>>> On Tue, Jan 30, 2018 at 2:03 PM, Jungtaek Lim <kabh...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> If we worry about breaking things for our
> >>>>>>>> users/consumers/distributors, picking one of the less
> >>>>>>>> used/updated connectors as the experiment makes more sense to
> >>>>>>>> me. It's OK if we want to intentionally pick one of the most
> >>>>>>>> active and widely used connectors to accelerate the
> >>>>>>>> experiment.
> >>>>>>>>
> >>>>>>>> Decoupling connectors and moving them to another repo like
> >>>>>>>> Bahir will make it clear who is interested in which
> >>>>>>>> connectors. For storm-eventhubs, for example, the major code
> >>>>>>>> contributions came from MS developers. Now they are gone, and
> >>>>>>>> I don't even know whether storm-eventhubs is compatible with
> >>>>>>>> recent Azure Event Hubs. That's just one of them. I've seen
> >>>>>>>> many connectors in the same, a similar, or a potentially
> >>>>>>>> similar (say, truck number 1) situation.
> >>>>>>>>
> >>>>>>>> -Jungtaek Lim (HeartSaVioR)
> >>>>>>>>
> >>>>>>>> On Tue, Jan 30, 2018 at 1:30 PM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Jan 29, 2018, at 8:03 PM, Jungtaek Lim <kabh...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> - Do we ensure they're all maintained?
> >>>>>>>> -- Did we exclude inactive committers/PMC members from the
> >>>>>>>> connectors' committer sponsors, and do they have enough
> >>>>>>>> committer sponsors after that?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Good point. We’ve had some sponsors go silent recently. Maybe
> >>>>>>>> ping sponsors and ask if they wish to maintain sponsorship?
> >>>>>>>>
> >>>>>>>> As a sponsor for a number of connectors, I’ll check on the
> >>>>>>>> ones I’ve sponsored.
> >>>>>>>>
> >>>>>>>> - Are they all worth maintaining in the Storm main repository?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Again, that’s a question of whether there is user/dev
> >>>>>>>> interest.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> -- Should we trigger a release if we find and resolve a
> >>>>>>>> critical/blocker issue in them? If not, why do we allow
> >>>>>>>> something in the main repository to be left in an inconsistent
> >>>>>>>> state?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Some are tied to fairly well established protocols, some
> >>>>>>>> target really volatile APIs. Bug reports and mailing list
> >>>>>>>> activity may not be a good status indicator.
> >>>>>>>>
> >>>>>>>> Storm’s Kafka integration was the initial model for the
> >>>>>>>> “batteries included” impetus behind `external`. If we want to
> >>>>>>>> evolve how that works, why not start there, see what
> >>>>>>>> works/doesn’t work, and adapt.
> >>>>>>>>
> >>>>>>>> I don’t want to shock our users/consumers/distributors.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> -Taylor

--
Thanks,
Xin