I still need to go over the entire discussion thread in more detail, but one thing I would like to bring up right away is the proposal to DEPRECATE, and eventually remove, the KafkaSpout with the old Kafka Consumer APIs. The storm-kafka-client KafkaSpout is getting stabilized, and I think we are all in agreement that the storm-kafka KafkaSpout has presented continuous maintainability problems, with some of the fixes that got in not being backwards compatible.
I am pretty confident about how things are looking at this point for the KafkaSpout. The Trident Kafka Spout is likely somewhere between alpha and beta, and that should be taken into account. I just recently submitted a PR <https://github.com/apache/storm/pull/2174> with some improvements to the Trident Kafka Spout (including the refactoring done to support manual partition assignment), and there are some customers using it in pre-production. However, it would definitely benefit from some more testing.

Thanks,
Hugo

On Jun 28, 2017, at 7:48 AM, Bobby Evans wrote:

+1. If the 1.1 and 1.2 lines start to become difficult to maintain we can look at putting them in maintenance mode too once we have a 2.x release. I am a little nervous about merging a new feature into the 1.x branch without first going to master, but I hope that it will not be too much work to port it to master, and I trust the devs on that branch to do the right thing.

On a related note, we have not done much with feature branches before, so I am not sure what we want to do about merging the new metrics API branch into 1.x. I know for me I have not had time to keep up with the development work going on there. I would at least like to have a pull request put up for review before we merge it in. This would fit with our current bylaws, which do not mention feature branches. If all of the changes have already followed the review process then technically I think it is OK to just merge it in, but I still would like to take some time to look at the changes, and especially the new APIs.

- Bobby

On Wednesday, June 28, 2017, 1:53:34 AM CDT, Jungtaek Lim wrote:

That's great news that metrics work is ready!
I'm +1 to Taylor's proposal, but in order to respect semantic versioning, I propose some modifications to it:

- create 1.1.x-branch with target version 1.1.1-SNAPSHOT and port back only bug fixes to the 1.1.x-branch
- change the target version of 1.x-branch to 1.2.0-SNAPSHOT

If we also agree on the above, I would like to volunteer for the back-port work.

Thanks,
Jungtaek Lim (HeartSaVioR)

On Wed, Jun 28, 2017 at 10:09 AM, Harsha wrote:

+1 for the above stated approach on releasing 1.2.0 with metrics.

-Harsha

On Tue, Jun 27, 2017, at 12:17 PM, P. Taylor Goetz wrote:

The work on metrics is ready for a pull request to 1.x-branch from the feature branch. I’ve held off because we haven’t reached consensus on a path forward with the 1.x release lines.

I’d like to propose the following for the 1.x line:

1. Create a branch for 1.2 so we have a branch to review the metrics stuff.
2. Release 1.1.1
3. Review/merge metrics work. Port metrics to master.
4. Release 1.2.0
5. Put the entire 1.x line into maintenance mode. Drop support for 1.0.x (we would only support 1.2.x and 1.1.x, which are very closely aligned).

Dropping support for the 1.0.x line would eliminate the need to maintain one of the fairly heavily diverged branches. The 1.2.x and 1.1.x lines would be very closely aligned. I just up-merged metrics_v2 against 1.x-branch after a while, and there were no conflicts.

That would give us a little more bandwidth to focus on 2.0 and needed bug fixes to the 1.x line, like some of the issues raised with storm-kafka-client. We could even start releasing alpha/beta versions of 2.0 in parallel to the steps above.

Any thoughts on that approach?

-Taylor

On Jun 24, 2017, at 1:21 AM, Jungtaek Lim wrote:

Yes I prefer option 1, but it might depend on the progress of metrics V2.
If it can be done within the predictable near future I'm OK with picking option 2, but if not, we would be better off focusing on releasing 2.0.0 and making it really happen.

Whichever way we go, I feel it's time to track the remaining work on Storm 2.0.0. I found some bugs on the master branch and filed issues, and we have remaining port work (UI and logviewer). We have some other improvements targeted for 2.0.0: worker redesign, Beam integration, and so on, and we don't track their progress at all. I don't think we should wait for features whose progress is not transparent (in other words, we don't know when they will be finished).

- Jungtaek Lim (HeartSaVioR)

On Sat, Jun 24, 2017 at 5:19 AM, P. Taylor Goetz wrote:

Bobby/Jungtaek,

Are you saying you want to forgo the 1.2 “metrics_v2” release and include it only in 2.0? (I ask because that work is already based on 1.x-branch, and forward-porting it to master is relatively simple.) I’d kind of like that work to go out soon.

If we go with option 1, I would want to see a 2.0 release (even if it’s a “beta” or “preview”) before putting the 1.x line into maintenance mode.

-Taylor

On Jun 23, 2017, at 9:51 AM, Bobby Evans wrote:

I see 2 ways to address this.

1) We put the 1.x line into maintenance mode like with 0.10. We don't backport anything except bug fixes.
2) We backport a lot of the backwards compatible changes from 2.x to 1.x.

My personal preference is 1. It makes it clear the direction we want to go in. The biggest issue is that we probably would want to do a 2.x release sooner rather than later. Even if we don't get all of the features that people want, if we just get a release out we can add in new features if they are backwards compatible, or we can create a 3.x line that would have the breaking changes in it.
- Bobby

On Thursday, June 22, 2017, 7:39:55 PM CDT, Jungtaek Lim wrote:

I'd like to bump this again instead of initiating a new discussion thread.

I have been having a hard time creating and applying pull requests for both master and 1.x-branch; it is really painful and sometimes a blocker for me in doing the merge step. The two branches have diverged more heavily than 0.10 and 1.0.0 did; even the IDE can't switch between the branches smoothly. We haven't even addressed the checkstyle issue yet, and after addressing it they could be "completely" diverged.

The JDK version is another major issue, since the pull requests targeted at the master branch are not checked against JDK 7, and some of them cause JDK version issues when ported back.

So personally I really would like to see the plan for the 1.x version line changed - skipping any minor releases including 1.2.0 - and have an epic issue for 2.0.0 and just go ahead. That was indeed our proposed plan. (The proposed plan even had 2.0.0 directly after 1.0.0.)

I would like to hear everyone's opinions. If we have consensus on not having any minor releases for the 1.x version line, I will not port back non-bugfix pull requests to 1.x-branch, and will guide contributors to create pull requests against the master branch, not the 1.x version line.

Thanks,
Jungtaek Lim (HeartSaVioR)

On Sun, Jun 4, 2017 at 1:17 AM, Alexandre Vermeerbergen wrote:

+1 for Roshan's suggestion: in our Storm 1.x based supervision system, we're very interested in anything that can provide better throughput.

2017-06-03 18:12 GMT+02:00 Roshan Naik:

For 2.0 beta … it would be good to incorporate some of the Worker improvements (STORM-2284) IMO. Changes to the messaging subsystem can be delivered sooner, and my in-progress implementation suggests that it will yield substantial latency improvements.
The 2.0 beta phase would really help kick the tires on the revised messaging system, and the performance improvements will also be a good incentive for trying out the 2.0 line. I notice multiple other bottlenecks that are holding back throughput a lot, which can be addressed in a subsequent 2.x minor release.

-roshan

On 6/3/17, 7:20 AM, "Jungtaek Lim" wrote:

I also would love to see the metrics V2 code sooner rather than later. If we can get it before releasing 2.0.0 that will be great, and then maybe we could just move toward 2.0.0, not adding any improvements to the 1.x version line. (And that's what I would want.)

If we really want to have 1.2.0, I suggest that we make the 1.1.1 version correct right now rather than after releasing 1.1.1. We also merged non-bugfix things into 1.x-branch, but that's not what users expect. I agree that the work may be painful, but we need to do it anyway.

- Jungtaek Lim (HeartSaVioR)

On Sat, Jun 3, 2017 at 3:49 AM, Bobby Evans wrote:

I would love to see the metrics V2 code come out sooner rather than later. +1.

My biggest blocker for a 1.x release is <https://github.com/apache/storm/pull/2142>. Even though the pull request says it is minor, it showed that we messed up pushing back some changes for pacemaker to open source (the code does not run at all, which for me is a blocker), and I really want to get that fully fixed/tested before another release.

As for 2.x, I think we are very close to being able to do a 2.x alpha release. I would like to see metrics V2 merged in simply because it is a big change for user facing APIs. But after that I would love to see us start to push forward on getting that out.

- Bobby

On Friday, June 2, 2017, 1:39:46 PM CDT, P. Taylor Goetz wrote:

I’d like to bump this thread and start a discussion around our next release. Here are my thoughts.
There are a number of important fixes in 1.x-branch so I’d like to consider releasing 1.1.1 soon. I’d appreciate input on any open issues that should be resolved for that release.

I’d like us to consider releasing the metrics improvements in STORM-2153 [1] as version 1.2.0. That work is in the metrics_v2 feature branch right now and would need to be reviewed and merged. That work is against the 1.x-branch right now. I would recommend porting it to master *after* the review/merge since there will likely be changes as a result of the review.

> Maybe related to or not, but would we want to create a new branch "1.1.x-branch", and make "1.x-branch" target for 1.2?

If we agree to the above, I would say yes. After the 1.1.1 release, we could create a 1.1.x-branch that would be the maintenance/release branch for that version line. 1.x-branch would then become the target for the 1.2.0 release.

There are a few fixes in the 0.10.x branch that probably warrant a release. After that we may want to back away from that version line a bit so we can focus more on newer versions.

In the past, we’ve shied away from doing “beta” releases, but I’m wondering if we might want to revisit that for the 2.0 release - the idea being that it would give early adopter users a chance to kick the tires on what’s coming in 2.0 and provide feedback, find bugs, etc. to help make the final release more solid. I’m on the fence here and could go either way. I’d appreciate any input others may have.

Thanks,

-Taylor

[1] https://issues.apache.org/jira/browse/STORM-2153

On Mar 30, 2017, at 9:09 PM, Jungtaek Lim wrote:

Maybe related to or not, but would we want to create a new branch "1.1.x-branch", and make "1.x-branch" target for 1.2? It's not clear to me whether we are skipping a 1.2 release in order to move toward 2.0.0, hence the question.
- Jungtaek Lim (HeartSaVioR)

On Wed, Mar 29, 2017 at 1:56 AM, Hugo Da Cruz Louro wrote:

+1 for finishing the porting to Java ahead of anything else - it will be a significant milestone. I have a JIRA assigned concerning the porting. I will work on it for the 2.0 release.

It’s a priority to guarantee no performance regressions. As part of this endeavor, we should explore an automated (or easy) way to run and assert major performance benchmarks. Ideally any contributor should be able to fairly easily test the impact of changes under certain performance test scenarios.

Beam Runner work should take into account the impact of incorporating new JStorm features and the Storm Worker Redesign <https://issues.apache.org/jira/browse/STORM-2284>. It is not very efficient to start doing it only to find out that it will have to change in the face of the Storm and worker redesign. That is, it should be done after its building blocks are stable.

Thanks,
Hugo

On Mar 24, 2017, at 12:07 AM, Arun Mahadevan wrote:

+1 to release with the porting completed. I think it's mainly the UI server and log viewer that are pending. We can start doing the regression and performance tests for whatever is already ported. If anyone is running the master branch in their pre-prod / prod environments, it will be good to know and give us more confidence.

The other features can be added in follow-up releases.

Regards,
Arun

On 3/24/17, 11:47 AM, "Satish Duggana" wrote:

+1 to have 2.0 with porting and performance (it should be at least as good as the 1.x release) issues addressed.

We can target other tasks (mentioned by Taylor and Jungtaek) for 2.x-branch.

Exactly-once support:
While thinking through the exactly-once support design, it became clear that it is better to avoid acking tuples and implement exactly-once by snapshotting barriers.
It seems the JStorm folks followed a similar design, and they claim it gives better performance. This feature is essential for the Beam runner, and we can decide on the respective approaches, though.

Beam Runner:
Let's hold off on this for now and keep it in Storm till 2.x. We should avoid having a minimal Beam runner in haste. It is better to address STORM-2284, exactly-once and other windowing enhancements first to enable the Beam runner.

JStorm:
Agree with Jungtaek on looking at the latest JStorm and aligning/scoping with the features for 2.x.

STORM-2284:
We may want to look at the JStorm worker before working on the respective components in this epic, to pull in appropriate enhancements.

YARN/MESOS:
Supporting Storm on YARN/Mesos for 2.x.

Thanks,
Satish.

On Fri, Mar 24, 2017 at 9:09 AM, Jungtaek Lim wrote:

First of all, +1 to complete only the port work, do a sanity check (including performance regression), and release. If we can get STORM-2284 within a deterministic time frame (say 2~3 months) that would be great, but if not I'd be in favor of postponing it to a later 2.x release.

JStorm released new versions after the code donation, so there are more things we could get ideas from, or even adopt.
https://github.com/alibaba/jstorm/blob/master/history.md

As you can see from the release note link, we also need to update phase 2, since they already changed what we're planning to do in phase 2. For example, they changed backpressure to end-to-end, and changed to use snapshots rather than the acker. To be sure, JStorm also pulled many features from today's Storm, like Flux, Windowing, more shuffle groupings, log search, log level change, and so on.

STORM-2426 <https://issues.apache.org/jira/browse/STORM-2426> is due to a limitation of the Spout lifecycle (everything is done in a single thread), and STORM-1358 <https://issues.apache.org/jira/browse/STORM-1358> (JStorm's multi-thread Spout) can remedy this (although the Spout implementation may then need to guarantee thread-safety).
It's not just an improvement but close to a design concern, so I would like to address it sooner than other things in phase 2.

On the Storm SQL side, I've lost track of the progress, but the major work would be adopting group by with windowing. It was not available from Calcite but will be available in the next release (1.12.0). I've filed this as STORM-2405 <https://issues.apache.org/jira/browse/STORM-2405>, but windowing & micro batch is not intuitive, so I would like to change the underlying API to the stream API in SQL. I've also filed this as STORM-2406 <https://issues.apache.org/jira/browse/STORM-2406>.

Just my 2 cents, by the way: I would like to see metrics V2 sooner, since we lose metrics even during normal operations like restarting workers, rebalancing, and so on. Eventually we will need to deal with dynamic scaling, and then metrics will be broken often.

Thanks,
Jungtaek Lim (HeartSaVioR)

On Fri, Mar 24, 2017 at 5:05 AM, Harsha Chintalapani wrote:

The Storm 2.0 migration to Java is in itself a big win and would attract a wider community and adoption. So my vote would be to resolve the first 3 items to get a release out. All the other features mentioned are great to have but shouldn't be blockers for the 2.0 release.

-Harsha

On Thu, Mar 23, 2017 at 11:51 AM P. Taylor Goetz wrote:

With the 1.1.0 release nearing completion, I’d like to turn our attention to 2.0 and develop a plan for what features, etc. to include.

The following 3 are what I feel are the minimum for a 2.0 release. These could likely be resolved relatively quickly:

* Performance - I’ve not benchmarked the master branch vs. 1.0.x or 1.1.x in a while, but I feel it will be important to make sure there are no performance regressions, and would hope that we actually have a performance improvement over previous versions. To that end (e.g.
if there is in fact a performance regression), the proposals that Roshan Naik put together for revising the threading and execution model (STORM-2307) and replacing Disruptor with JCTools (STORM-2306) warrant review and consideration. See also STORM-2284, which is the parent JIRA.
* Finish porting Storm UI to Java (STORM-1311)
* Finish porting log viewer to Java (STORM-1280)

The following are items that are nice to have in 2.0, but I don’t feel are absolutely necessary for an initial 2.0 release:

* Beam Runner (I wouldn’t tie this to 2.0, mentioning it because it’s relevant) - Initially there seemed to be a lot of interest in this, but that seems to have trailed off. I spoke with some Beam developers and there seems to be interest from that community as well. Do we want to move that effort to the Beam community, or keep it here? Moving it to the Beam community might lead to better collaboration between projects.
* Bounded Spouts (needed for Beam Runner implementation) - Currently spouts are unbounded; there is no end to the stream. Beam has the concept of bounded sources (roughly analogous to batch processing). To support that, we would need to implement a similar concept in Storm. One benefit of such a feature would be the ability to handle both bounded and unbounded workflows in Storm.
* Storm-SQL - Jungtaek/Xin: You have been the primary drivers behind this effort. What improvements do you envision for 2.0?
* Metrics V2 (STORM-2153: Coda Hale Metrics) - I’ve been targeting this for 1.2.0, but it’s designed to be easily portable to master/2.0.
* JStorm Migration - Original outline can be found here [1]. Note that a lot of the associated JIRAs below are assigned, but there hasn’t been any recent activity or pull requests, so we should probably consider them unassigned and up for grabs:
    * Worker Classloader Isolation (STORM-1338) - Lack of this has been the bane of a lot of Storm users almost since day one. We have largely addressed it by shading/relocating dependencies.
It would be great to see this addressed once and for all.
    * JStorm back pressure implementation (STORM-1324) - The current back pressure implementation leaves a bit to be desired, and the JStorm approach looks promising, though it also depends on the JStorm concept of “topology master” (STORM-1323), which may have some implications regarding security.
    * Dynamic Topology Updates (STORM-1335) - This would provide a command to update topology jars and configuration without stopping the topology, and is well suited to leverage the blobstore. The restart command (that can also update the topology configuration) also looks compelling (STORM-1334).
    * Additional Scheduler Implementations (STORM-1320)
    * Additional Grouping Implementations (STORM-1328)

As always I’m open to any opinions and suggestions.

-Taylor

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109
