FYI, I just tried separating the 1.x branch and the 1.1.x branch (experimentally, in my forked repo):
https://github.com/heartsavior/storm/tree/1.1.x-branch-experimental

I've updated the CHANGELOG only once in that branch, so you can see the full changelog containing the issues that were ported back:
https://github.com/HeartSaVioR/storm/commit/81e9d65793abc5defc0ab83c09b26a7dcba7e0eb

Most of the issues are classified as bug fixes, but some issues were filtered out (for example, wildcard classpath, refactor storm-autocreds, binary storm-redis state, and so on).

Please let me know your opinion on filtering out non-bugfix issues from 1.1.1. If there is no objection, I'll make the following changes:

- rename the branch to 1.1.x-branch and push
- change the version of 1.x-branch to 1.2.0-SNAPSHOT
- reflect the version change in the JIRA issues

Thanks,
Jungtaek Lim (HeartSaVioR)

On Thu, Jun 29, 2017 at 6:41 AM, Alexandre Vermeerbergen <avermeerber...@gmail.com> wrote:

> Hi Hugo,
>
> Thanks for your concerns about our troubles with the new storm-kafka-client.
>
> Our "bench" is based on the live production data of our cloud supervision system, collecting at least 1 million metrics/min in our Kafka Brokers cluster (currently based on Kafka 0.10.1.0, with "compatibility flag active").
>
> More details are available in the thread on this same dev list entitled "Lag issues using Storm 1.1.1 latest build with StormKafkaClient 1.1.1 vs old StormKafka spouts".
>
> The latest post on this thread, from Stig Døssing, gives me back some hope of seeing progress in understanding our issues.
>
> My point about writing our own Spout comes from our past experience: we've been using Kafka for a very long time in our supervision application.
> Way before we decided to use Storm, we had our own Java daemons consuming the same topics as today, doing some evaluation and writing them into an in-memory store for later consumption by our web services - see this as a "poor man's streaming system" ;-) In this legacy code, the part in charge of consuming data from Kafka wasn't the most complex one we had: a small pool of threads using the old Kafka consumer API... so maybe I'm wrong, but for *this* purpose I do not feel that writing a Spout consuming a few topics would be a big effort. But of course, if we do that, then we'll miss the fancy integration in StormUI, flux, and the ability to subscribe to multiple topics based on a wildcard expression.
>
> So we're going to carefully dig into Stig's answer, and probably provide more details on our bench before jumping into our home-brewed Kafka spout... then I'll have to make a decision based on how much progress we have made vs. the time remaining before our next delivery gate.
>
> I hope this clarifies my position with regard to storm-kafka-client.
>
> Best regards,
> Alexandre
>
> 2017-06-28 22:49 GMT+02:00 Hugo Da Cruz Louro <hlo...@hortonworks.com>:
>
> > Hi Alexandre,
> >
> > In my benchmarks the storm-kafka-client spout improves throughput by 70% and latency by 40% vs. the storm-kafka implementation. I am surprised by your findings substantiating the opposite. Can you share the benchmark where you compare the performance of both implementations?
> >
> > As for writing your own version of the spout: why not contribute to this one instead? Do you think the implementation is that poor? If so, why do you think it is that poor? Do you expect your first version to be much better than a version that is already in production at several customers, and seems to be working fairly well?
> >
> > All the bugs found so far have been addressed. Since it’s a new feature there may be a few bugs - it is expected.
> > However, I don’t think that it is as bad as you make it sound, as there are several people using it in production for extended periods of time.
> >
> > Cheers
> >
> > > On Jun 28, 2017, at 12:04 PM, Alexandre Vermeerbergen <avermeerber...@gmail.com> wrote:
> > >
> > > Hello,
> > >
> > > If that matters, our current experience with StormKafkaClient is disappointing (see my recent posts "Lag issues using Storm 1.1.1 latest build with StormKafkaClient 1.1.1 vs old StormKafka spouts" in this mailing list).
> > >
> > > Our current experience is that the old StormKafka spout always beats the new one in terms of performance & stability.
> > >
> > > Therefore, I am surprised to see talk about deprecating the old StormKafka spout when the new one, which just became "Generally Available" with Storm 1.1.0, is not stable, and is no better when we try it from current 1.1.x builds to take into account recently closed JIRAs.
> > >
> > > We're even considering writing our own Kafka spout with the Kafka 0.10.x API to overcome the incompatibility of the old StormKafka spout with Kafka 0.10 libraries.
> > >
> > > Thus, for people who are comfortable with the old Kafka spout, I'd like to give a -1 (non-binding) to the proposal to withdraw the old StormKafka spout until the new one converges.
> > >
> > > Best regards,
> > > Alexandre Vermeerbergen
> > >
> > > 2017-06-28 19:40 GMT+02:00 P. Taylor Goetz <ptgo...@gmail.com>:
> > >
> > >>> On Jun 28, 2017, at 1:16 PM, Hugo Da Cruz Louro <hlo...@hortonworks.com> wrote:
> > >>>
> > >>> I still need to go over the entire discussion thread in more detail, but one thing I would like to bring up right away is the proposal to DEPRECATE, and eventually remove, the KafkaSpout with the old Kafka Consumer APIs.
> > >>> The storm-kafka-client KafkaSpout is getting stabilized, and I think we are all in agreement that the storm-kafka KafkaSpout has presented continuous maintainability problems, with some of the fixes that got in not being backwards compatible.
> > >>
> > >> I’m fine with deprecating the old KafkaSpout, but I feel the decision to actually remove it needs to take into account the user community. The main sticking point here is compatibility with earlier versions of Kafka. Like with JDK versions, there are many valid reasons why users may not be in a position to upgrade to a newer version of Kafka. Outright removal could leave some users in the lurch.
> > >>
> > >> Ideally, we could just poll the user community to get an idea of how much of the user base depends on the old KafkaSpout and use the results to guide our decision. Unfortunately, at least in my past experience, polling the user@ list doesn’t elicit much of a response and the results don’t provide an accurate view of the user community.
> > >>
> > >>> I am pretty confident about how things are looking at this point for the KafkaSpout. The Trident Kafka Spout is likely in between alpha and beta, and that should be taken into account. I just recently submitted a PR <https://github.com/apache/storm/pull/2174> with some improvements to the Trident Kafka Spout (including the refactoring done to support manual partition assignment), and there are some customers using it in pre-production. However, it definitely would benefit from some more testing.
> > >>>
> > >>> Thanks,
> > >>> Hugo
> > >>
> > >> -Taylor
> > >>
> > >>> On Jun 28, 2017, at 7:48 AM, Bobby Evans <ev...@yahoo-inc.com.INVALID> wrote:
> > >>>
> > >>> +1.
> > >>> If the 1.1 and 1.2 lines start to become difficult to maintain we can look at putting them in maintenance mode too once we have a 2.x release.
> > >>>
> > >>> I am a little nervous about merging a new feature into the 1.x branch without first going to master, but I hope that it will not be too much work to port it to master, and I trust the devs on that branch to do the right thing.
> > >>>
> > >>> On a related note, we have not done much with feature branches before, so I am not sure what we want to do about merging the new metrics API branch into 1.x. I know that I have not had time to keep up with the development work going on there. I would at least like to have a pull request put up for review before we merge it in. This would fit with our current bylaws, which do not mention feature branches. If all of the changes have already followed the review process then technically I think it is OK to just merge it in, but I still would like to take some time to look at the changes, and especially the new APIs.
> > >>>
> > >>> - Bobby
> > >>>
> > >>> On Wednesday, June 28, 2017, 1:53:34 AM CDT, Jungtaek Lim <kabh...@gmail.com> wrote:
> > >>>
> > >>> That's great news that the metrics work is ready!
> > >>>
> > >>> I'm +1 to Taylor's proposal, but in order to respect semantic versioning, I propose some modifications to it:
> > >>>
> > >>> - create 1.1.x-branch with target version 1.1.1-SNAPSHOT and port back only bug fixes to the 1.1.x-branch
> > >>> - change the target version of 1.x-branch to 1.2.0-SNAPSHOT
> > >>>
> > >>> If we also agree on the above, I would like to volunteer for the back-port work.
> > >>>
> > >>> Thanks,
> > >>> Jungtaek Lim (HeartSaVioR)
> > >>>
> > >>> On Wed, Jun 28, 2017 at 10:09 AM, Harsha <st...@harsha.io> wrote:
> > >>>
> > >>> +1 for the above stated approach of releasing 1.2.0 with metrics
> > >>> -Harsha
> > >>>
> > >>> On Tue, Jun 27, 2017, at 12:17 PM, P. Taylor Goetz wrote:
> > >>> The work on metrics is ready for a pull request to 1.x-branch from the feature branch. I’ve held off because we haven’t reached consensus on a path forward with the 1.x release lines.
> > >>>
> > >>> I’d like to propose the following for the 1.x line:
> > >>>
> > >>> 1. Create a branch for 1.2 so we have a branch to review the metrics stuff.
> > >>> 2. Release 1.1.1
> > >>> 3. Review/merge metrics work. Port metrics to master.
> > >>> 4. Release 1.2.0
> > >>> 5. Put the entire 1.x line into maintenance mode. Drop support for 1.0.x. (we would only support 1.2.x and 1.1.x which are very closely aligned).
> > >>>
> > >>> Dropping support for the 1.0.x line would eliminate the need to maintain one of the fairly heavily diverged branches. The 1.2.x and 1.1.x would be very closely aligned. I just up-merged metrics_v2 against 1.x-branch after a while, and there were no conflicts.
> > >>>
> > >>> That would give us a little more bandwidth to focus on 2.0 and needed bug fixes to the 1.x line like some of the issues raised with storm-kafka-client. We could even start releasing alpha/beta versions of 2.0 in parallel to the steps above.
> > >>>
> > >>> Any thoughts on that approach?
> > >>>
> > >>> -Taylor
> > >>>
> > >>> On Jun 24, 2017, at 1:21 AM, Jungtaek Lim <kabh...@gmail.com> wrote:
> > >>>
> > >>> Yes I prefer option 1, but it might depend on the progress of metrics V2.
> > >>> If it can be done within the predictable near future I'm OK with picking option 2, but if not, we would be better off focusing on releasing 2.0.0 and making it really happen.
> > >>>
> > >>> Whichever way we go, I feel it's time to track the remaining work on Storm 2.0.0. I found some bugs on the master branch and filed issues, and we still have remaining port work (UI and logviewer). We have some other improvements targeted for 2.0.0: worker redesign, beam integration, and so on, and we don't track their progress at all. I don't think we should wait for features whose progress is not transparent (in other words, we don't know when they will be finished).
> > >>>
> > >>> - Jungtaek Lim (HeartSaVioR)
> > >>>
> > >>> On Sat, Jun 24, 2017 at 5:19 AM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
> > >>>
> > >>> Bobby/Jungtaek,
> > >>>
> > >>> Are you saying you want to forego the 1.2 “metrics_v2” release and include it only in 2.0? (I ask because that work is already based on 1.x-branch, and forward-porting it to master is relatively simple.) I’d kind of like that work to go out soon.
> > >>>
> > >>> If we go with option 1, I would want to see a 2.0 release (even if it’s a “beta” or “preview”) before putting the 1.x line into maintenance mode.
> > >>>
> > >>> -Taylor
> > >>>
> > >>> On Jun 23, 2017, at 9:51 AM, Bobby Evans <ev...@yahoo-inc.com.INVALID> wrote:
> > >>>
> > >>> I see 2 ways to address this.
> > >>> 1) We put the 1.x line into maintenance mode like with 0.10. We don't backport anything except bug fixes.
> > >>> 2) We backport a lot of the backwards compatible changes from 2.x to 1.x.
> > >>> My personal preference is 1. It makes it clear the direction we want to go in.
> > >>> The biggest issue is that we probably would want to do a 2.x release sooner rather than later. Even if we don't get all of the features that people want, if we just get a release out we can add in new features if they are backwards compatible, or we can create a 3.x line that would have the breaking changes in it.
> > >>>
> > >>> - Bobby
> > >>>
> > >>> On Thursday, June 22, 2017, 7:39:55 PM CDT, Jungtaek Lim <kabh...@gmail.com> wrote:
> > >>>
> > >>> I'd like to bump this again instead of initiating a new discussion thread.
> > >>>
> > >>> I have been having a hard time creating and applying pull requests for both master and 1.x-branch; it is really painful and sometimes a blocker for me when doing the merge step.
> > >>> The two branches have diverged more heavily than 0.10 and 1.0.0 did; even the IDE can't switch between the branches smoothly. We haven't even addressed the checkstyle issue yet, and once we do, they could be "completely" diverged. The JDK version is another major issue, since pull requests targeted at the master branch are not checked against JDK 7, and some of them cause JDK version issues when ported back.
> > >>>
> > >>> So personally I really would like to see the plan for the 1.x version line changed - skipping any minor releases, including 1.2.0 - and have an epic issue for 2.0.0 and just go ahead. That was indeed our proposed plan. (The proposed plan even had 2.0.0 directly after 1.0.0.)
> > >>>
> > >>> I would like to hear everyone's opinions. If we have consensus on not having any minor releases for the 1.x version line, I will not port back non-bugfix pull requests to 1.x-branch, and will guide contributors to create pull requests against the master branch, not the 1.x version line.
> > >>>
> > >>> Thanks,
> > >>> Jungtaek Lim (HeartSaVioR)
> > >>>
> > >>> On Sun, Jun 4, 2017 at 1:17 AM, Alexandre Vermeerbergen <avermeerber...@gmail.com> wrote:
> > >>>
> > >>> +1 for Roshan's suggestion: in our Storm 1.x based supervision system, we're very interested in anything that can provide better throughput.
> > >>>
> > >>> 2017-06-03 18:12 GMT+02:00 Roshan Naik <ros...@hortonworks.com>:
> > >>>
> > >>> For 2.0 beta … it would be good to incorporate some of the Worker improvements (STORM-2284) IMO. Changes to the messaging subsystem can be delivered sooner, and my in-progress implementation suggests that it will yield substantial latency improvements. The 2.0 beta phase would really help kick the tires on the revised messaging system, and the performance improvements will also be a good incentive for trying out the 2.0 line.
> > >>>
> > >>> I notice multiple other bottlenecks that are holding back throughput a lot, which can be addressed in a subsequent 2.x minor release.
> > >>> -roshan
> > >>>
> > >>> On 6/3/17, 7:20 AM, "Jungtaek Lim" <kabh...@gmail.com> wrote:
> > >>>
> > >>> I also would love to see the metrics V2 code sooner or later. If we can get it before releasing 2.0.0 that will be great, and then maybe we could just move toward 2.0.0, not adding any improvements to the 1.x version line. (And that's what I would want.)
> > >>>
> > >>> If we really want to have 1.2.0, I suggest that we make the 1.1.1 version correct right now rather than after releasing 1.1.1. We also merged non-bugfix things to 1.x-branch, but that's not what users expect.
> > >>> I agree that the work may be painful, but we need to do it anyway.
> > >>>
> > >>> - Jungtaek Lim (HeartSaVioR)
> > >>>
> > >>> On Sat, Jun 3, 2017 at 3:49 AM, Bobby Evans <ev...@yahoo-inc.com.invalid> wrote:
> > >>>
> > >>> I would love to see the metrics V2 code come out sooner rather than later. +1.
> > >>> My biggest blocker for a 1.x release is https://github.com/apache/storm/pull/2142
> > >>> Even though the pull request says it is minor, it showed that we messed up pushing back some changes for pacemaker to open source (the code does not run at all, which for me is a blocker) and I really want to get that fully fixed/tested before another release.
> > >>> As for 2.x, I think we are very close to being able to do a 2.x alpha release. I would like to see metrics V2 merged in simply because it is a big change for user-facing APIs. But after that I would love to see us start pushing forward on getting that out.
> > >>>
> > >>> - Bobby
> > >>>
> > >>> On Friday, June 2, 2017, 1:39:46 PM CDT, P. Taylor Goetz <ptgo...@gmail.com> wrote:
> > >>>
> > >>> I’d like to bump this thread and start a discussion around our next release. Here are my thoughts.
> > >>>
> > >>> There are a number of important fixes in 1.x-branch so I’d like to consider releasing 1.1.1 soon. I’d appreciate input on any open issues that should be resolved for that release.
> > >>>
> > >>> I’d like us to consider releasing the metrics improvements in STORM-2153 [1] as version 1.2.0. That work is in the metrics_v2 feature branch right now and would need to be reviewed and merged. That work is against the 1.x-branch right now.
> > >>> I would recommend porting it to master *after* the review/merge since there will likely be changes as a result of the review.
> > >>>
> > >>> Maybe related to or not, but would we want to create a new branch "1.1.x-branch", and make "1.x-branch" target for 1.2?
> > >>>
> > >>> If we agree to the above, I would say yes. After the 1.1.1 release, we could create a 1.1.x-branch that would be the maintenance/release branch for that version line. 1.x-branch would then become the target for the 1.2.0 release.
> > >>>
> > >>> There are a few fixes in the 0.10.x branch that probably warrant a release. After that we may want to back away from that version line a bit so we can focus more on newer versions.
> > >>>
> > >>> In the past, we’ve shied away from doing “beta” releases, but I’m wondering if we might want to revisit that for the 2.0 release - the idea being that it would give early adopter users a chance to kick the tires on what’s coming in 2.0 and provide feedback, find bugs, etc. to help make the final release more solid. I’m on the fence here and could go either way.
> > >>>
> > >>> I’d appreciate any input others may have.
> > >>>
> > >>> Thanks,
> > >>>
> > >>> -Taylor
> > >>>
> > >>> [1] https://issues.apache.org/jira/browse/STORM-2153
> > >>>
> > >>> On Mar 30, 2017, at 9:09 PM, Jungtaek Lim <kabh...@gmail.com> wrote:
> > >>>
> > >>> Maybe related to or not, but would we want to create a new branch "1.1.x-branch", and make "1.x-branch" target for 1.2?
> > >>>
> > >>> It's not clear to me whether we are skipping 1.2 in order to move toward 2.0.0, hence the question.
> > >>>
> > >>> - Jungtaek Lim (HeartSaVioR)
> > >>>
> > >>> On Wed, Mar 29, 2017 at 1:56 AM, Hugo Da Cruz Louro <hlo...@hortonworks.com> wrote:
> > >>>
> > >>> +1 for finishing the porting to Java ahead of anything else - it will be a significant milestone. I have a JIRA assigned concerning the porting. I will work on it for the 2.0 release.
> > >>>
> > >>> It’s a priority to guarantee no performance regressions. As part of this endeavor, we should explore an automated (or easy) way to run and assert major performance benchmarks. Ideally any contributor should be able to fairly easily test the impact of changes under certain performance test scenarios.
> > >>>
> > >>> Beam Runner work should take into account the impact of incorporating new JStorm features and the Storm Worker Redesign <https://issues.apache.org/jira/browse/STORM-2284>. It would not be very efficient to start on it only to find out that it will have to change in the face of the Storm and worker redesign. That is, it should be done after its building blocks are stable.
> > >>>
> > >>> Thanks,
> > >>> Hugo
> > >>>
> > >>> On Mar 24, 2017, at 12:07 AM, Arun Mahadevan <ar...@apache.org> wrote:
> > >>>
> > >>> +1 to release with the porting completed. I think it's mainly the UI server and log viewer that are pending.
> > >>>
> > >>> We can start doing the regression and performance tests for whatever is already ported.
> > >>>
> > >>> If anyone is running the master branch in their pre-prod / prod environments, it would be good to know and would give us more confidence.
> > >>>
> > >>> The other features can be added in follow-up releases.
> > >>>
> > >>> Regards,
> > >>> Arun
> > >>>
> > >>> On 3/24/17, 11:47 AM, "Satish Duggana" <satish.dugg...@gmail.com> wrote:
> > >>>
> > >>> +1 to have 2.0 with the porting and performance issues addressed (it should be at least as good as the 1.x release).
> > >>>
> > >>> We can target other tasks (mentioned by Taylor and Jungtaek) for 2.x-branch.
> > >>>
> > >>> Exactly-once support:
> > >>> While thinking through the exactly-once support design, it became clear that it is better to avoid acking tuples and implement exactly-once by snapshotting barriers. It seems the JStorm folks followed a similar design, and they claim it gives better performance. This feature is essential for the Beam runner, though we can decide on the respective approaches.
> > >>>
> > >>> Beam Runner
> > >>> Let's hold off on this for now and keep it in Storm till 2.x. We should avoid having a minimal Beam runner in haste. It is better to address STORM-2284, exactly-once and other windowing enhancements to enable the Beam runner.
> > >>>
> > >>> JStorm
> > >>> Agree with Jungtaek on looking at the latest JStorm and aligning/scoping with the features for 2.x.
> > >>>
> > >>> STORM-2284
> > >>> We may want to look at the JStorm worker before working on the respective components in this epic, to pull in appropriate enhancements.
> > >>>
> > >>> YARN/MESOS
> > >>> Supporting Storm on YARN/Mesos for 2.x.
> > >>>
> > >>> Thanks,
> > >>> Satish.
> > >>>
> > >>> On Fri, Mar 24, 2017 at 9:09 AM, Jungtaek Lim <kabh...@gmail.com> wrote:
> > >>>
> > >>> First of all, +1 to completing only the port work, doing a sanity check (including performance regression), and releasing.
> > >>>
> > >>> If we can get STORM-2284 within a deterministic time frame (say 2~3 months) that would be great, but if not I'd be in favor of postponing it to a later 2.x release.
> > >>>
> > >>> JStorm released new versions after the code donation, so there are more things we could get ideas from, or even adopt.
> > >>> https://github.com/alibaba/jstorm/blob/master/history.md
> > >>> As you can see from the release notes link, we also need to update phase 2, since they have already changed what we were planning to do in phase 2. For example, they changed backpressure to end-to-end, and changed to use snapshots rather than the acker.
> > >>> To be sure, JStorm also pulled many features from today's Storm, like Flux, Windowing, more shuffle groupings, log search, log level change, and so on.
> > >>>
> > >>> STORM-2426 <https://issues.apache.org/jira/browse/STORM-2426> is due to the limitation of the Spout lifecycle (everything is done in a single thread), and STORM-1358 <https://issues.apache.org/jira/browse/STORM-1358> (JStorm's multi-thread Spout) can remedy this (although Spout implementations may then need to guarantee thread-safety). It's not just an improvement but close to a design concern, so I would like to address it sooner than the other things in phase 2.
> > >>>
> > >>> On the Storm SQL side, I've lost track of progress, but the major work would be adopting group by with windowing. It was not available in Calcite but will be available in the next release (1.12.0).
> > >>> I've filed this as STORM-2405 <https://issues.apache.org/jira/browse/STORM-2405>, but windowing & micro batch is not intuitive, so I would like to change the underlying API to the stream API in SQL. I also filed this as STORM-2406 <https://issues.apache.org/jira/browse/STORM-2406>.
> > >>>
> > >>> Just my 2 cents, by the way: I would like to see metrics V2 sooner, since we lose metrics even during normal operations like restarting a worker, rebalancing, and so on. Eventually we will need to deal with dynamic scaling, and then metrics will break often.
> > >>>
> > >>> Thanks,
> > >>> Jungtaek Lim (HeartSaVioR)
> > >>>
> > >>> On Fri, Mar 24, 2017 at 5:05 AM, Harsha Chintalapani <st...@harsha.io> wrote:
> > >>>
> > >>> The Storm 2.0 migration to Java is in itself a big win and would attract a wider community and adoption. So my vote would be to resolve the first 3 items to get a release out.
> > >>> All the other features mentioned are great to have but shouldn't be blockers for the 2.0 release.
> > >>>
> > >>> -Harsha
> > >>>
> > >>> On Thu, Mar 23, 2017 at 11:51 AM P. Taylor Goetz <ptgo...@gmail.com> wrote:
> > >>>
> > >>> With the 1.1.0 release nearing completion, I’d like to turn our attention to 2.0 and develop a plan for what features, etc. to include.
> > >>>
> > >>> The following 3 are what I feel are the minimum for a 2.0 release.
> > >>> These could likely be resolved relatively quickly:
> > >>>
> > >>> * Performance - I’ve not benchmarked the master branch vs. 1.0.x or 1.1.x in a while, but I feel it will be important to make sure there are no performance regressions, and would hope that we actually have a performance improvement over previous versions. To that end (e.g. if there is in fact a performance regression), the proposals that Roshan Naik put together for revising the threading and execution model (STORM-2307) and replacing Disruptor with JCTools (STORM-2306) warrant review and consideration. See also STORM-2284 which is the parent JIRA.
> > >>>
> > >>> * Finish porting Storm UI to java (STORM-1311)
> > >>>
> > >>> * Finish porting log viewer to java (STORM-1280)
> > >>>
> > >>> The following are items that are nice to have in 2.0, but I don’t feel are absolutely necessary for an initial 2.0 release:
> > >>>
> > >>> * Beam Runner (I wouldn’t tie this to 2.0, mentioning it because it’s relevant) - Initially there seemed to be a lot of interest in this, but that seems to have trailed off. I spoke with some Beam developers and there seems to be interest from that community as well. Do we want to move that effort to the Beam community, or keep it here? Moving it to the Beam community might lead to better collaboration between projects.
> > >>>
> > >>> * Bounded Spouts (needed for Beam Runner implementation) - Currently spouts are unbounded; there is no end to the stream. Beam has the concept of bounded sources (roughly analogous to batch processing).
> > >>> To support that, we would need to implement a similar concept in Storm. One benefit of such a feature would be the ability to handle both bounded and unbounded workflows in Storm.
> > >>>
> > >>> * Storm-SQL - Jungtaek/Xin: You have been the primary drivers behind this effort. What improvements do you envision for 2.0?
> > >>>
> > >>> * Metrics V2 (STORM-2153: Coda Hale Metrics) - I’ve been targeting this for 1.2.0, but it’s designed to be easily portable to master/2.0.
> > >>>
> > >>> * JStorm Migration - Original outline can be found here [1]. Note that a lot of the associated JIRAs below are assigned, but there hasn’t been any recent activity or pull requests, so we should probably consider them unassigned and up for grabs:
> > >>>
> > >>> * Worker Classloader Isolation (STORM-1338) - Lack of this has been the bane of a lot of Storm users almost since day one. We have largely addressed it by shading/relocating dependencies. It would be great to see this addressed once and for all.
> > >>>
> > >>> * JStorm back pressure implementation (STORM-1324) - The current back pressure implementation leaves a bit to be desired, and the JStorm approach looks promising, though it also depends on the JStorm concept of “topology master” (STORM-1323), which may have some implications regarding security.
> > >>>
> > >>> * Dynamic Topology Updates (STORM-1335) - This would provide a command to update topology jars and configuration without stopping the topology, and is well suited to leverage the blobstore.
> > >>> The restart command (that can also update the topology configuration) also looks compelling (STORM-1334).
> > >>>
> > >>> * Additional Scheduler Implementations (STORM-1320)
> > >>>
> > >>> * Additional Grouping Implementations (STORM-1328)
> > >>>
> > >>> As always I’m open to any opinions and suggestions.
> > >>>
> > >>> -Taylor
> > >>>
> > >>> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109