+1 to releasing 2.0 once the porting and performance issues are addressed
(performance should be at least as good as the 1.x releases).

We can target the other tasks (mentioned by Taylor and Jungtaek) for the
2.x branch.


Exactly-once support:
While thinking through the exactly-once support design, I realized it is
better to avoid acking tuples and to implement exactly-once by snapshotting
at barriers. It seems the JStorm folks followed a similar design, and they
claim it gives better performance. This feature is essential for the Beam
runner, though we can still decide on the respective approaches.
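
To make the barrier idea concrete, here is a minimal sketch of barrier
alignment for a single task with several input channels. It is only an
illustration under my own assumptions: the class and method names
(BarrierAligner, onTuple, onBarrier) are hypothetical and are not part of
Storm or JStorm, and the "state" is just a tuple counter.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

/** Hypothetical sketch of barrier alignment for one task with several input channels. */
public class BarrierAligner {
    private final int numChannels;
    private final Set<Integer> blocked = new HashSet<>();      // channels whose barrier has arrived
    private final Queue<String> buffered = new ArrayDeque<>(); // tuples held back during alignment
    private final List<Long> snapshots = new ArrayList<>();    // state captured at each checkpoint
    private long count = 0;                                     // toy operator state: tuples processed

    public BarrierAligner(int numChannels) {
        this.numChannels = numChannels;
    }

    /** Process a data tuple, or buffer it if its channel is already past the barrier. */
    public void onTuple(int channel, String tuple) {
        if (blocked.contains(channel)) {
            buffered.add(tuple);
        } else {
            count++;
        }
    }

    /** Returns true once the barrier has arrived on every channel and the snapshot was taken. */
    public boolean onBarrier(int channel) {
        blocked.add(channel);
        if (blocked.size() < numChannels) {
            return false;              // still waiting for the other channels
        }
        snapshots.add(count);          // snapshot covers exactly the pre-barrier tuples
        blocked.clear();
        count += buffered.size();      // buffered tuples are processed in the next epoch
        buffered.clear();
        return true;                   // caller would now forward the barrier downstream
    }

    public long count() {
        return count;
    }

    public List<Long> snapshots() {
        return snapshots;
    }
}
```

The point of the sketch is that no per-tuple ack is needed: a tuple arriving
after the barrier on one channel is held back until the barrier has arrived
on all channels, so the snapshot reflects a consistent cut of the stream.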

Beam Runner
Let's hold off on this for now and keep it in Storm until 2.x. We should
avoid rushing out a minimal Beam runner. It is better to first address
STORM-2284, exactly-once, and other windowing enhancements to enable the
Beam runner.

JStorm
I agree with Jungtaek on looking at the latest JStorm and aligning/scoping
its features for 2.x.

STORM-2284
We may want to look at the JStorm worker before working on the respective
components in this epic, so that we can pull in appropriate enhancements.

YARN/MESOS
Supporting Storm on YARN/Mesos for 2.x.

Thanks,
Satish.


On Fri, Mar 24, 2017 at 9:09 AM, Jungtaek Lim <kabh...@gmail.com> wrote:

> First of all, +1 to completing only the port work, doing a sanity check
> (including performance regression), and releasing.
>
> If we can get STORM-2284 within a predictable time frame (say 2~3 months)
> that would be great, but if not I'd be in favor of postponing it to a later
> 2.x release.
>
> JStorm has released new versions since the code donation, so there are more
> things we could get ideas from, or even adopt.
> https://github.com/alibaba/jstorm/blob/master/history.md
> As you can see from the release notes link, we also need to update phase 2,
> since they have already changed what we're planning to do in phase 2. For
> example, they changed backpressure to end-to-end, and changed to use
> snapshots rather than the acker.
> To be sure, JStorm also pulled many features from today's Storm, like Flux,
> Windowing, more shuffle groupings, log search, log level change, and so on.
>
> STORM-2426 <https://issues.apache.org/jira/browse/STORM-2426> is due to the
> limitation of the Spout lifecycle (everything is done in a single thread),
> and STORM-1358 <https://issues.apache.org/jira/browse/STORM-1358> (JStorm's
> multi-thread Spout) can remedy this (although Spout implementations may
> then need to guarantee thread safety). It's not just an improvement but
> close to a design concern, so I would like to address it sooner than other
> things in phase 2.
>
> On the Storm SQL side, I've lost progress, but the major work would be
> adopting group by with windowing. It was not available in Calcite but will
> be available in the next release (1.12.0).
> I've filed this as STORM-2405
> <https://issues.apache.org/jira/browse/STORM-2405>, but windowing & micro
> batch is not intuitive, so I would like to change the underlying API to the
> stream API in SQL. I've also filed this as STORM-2406
> <https://issues.apache.org/jira/browse/STORM-2406>.
>
> Just my 2 cents, btw: I would like to see Metrics V2 sooner, since we lose
> metrics even during normal operations like restarting a worker,
> rebalancing, and so on. Eventually we need to deal with dynamic scaling,
> and then metrics will break often.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Fri, Mar 24, 2017 at 5:05 AM, Harsha Chintalapani <st...@harsha.io> wrote:
>
> > The Storm 2.0 migration to Java is in itself a big win and would attract
> > a wider community and adoption. So my vote would be to resolve the first
> > 3 items to get a release out.
> > All the other features mentioned are great to have but shouldn't be
> > blockers for the 2.0 release.
> >
> > -Harsha
> >
> > On Thu, Mar 23, 2017 at 11:51 AM P. Taylor Goetz <ptgo...@gmail.com>
> > wrote:
> >
> > > With the 1.1.0 release nearing completion, I’d like to turn our
> > > attention to 2.0 and develop a plan for what features, etc. to include.
> > >
> > > The following 3 are what I feel are the minimum for a 2.0 release.
> > > These could likely be resolved relatively quickly:
> > >
> > > * Performance - I’ve not benchmarked the master branch vs. 1.0.x or
> > > 1.1.x in a while, but I feel it will be important to make sure there
> > > are no performance regressions, and would hope that we actually have a
> > > performance improvement over previous versions. To that end (e.g. if
> > > there is in fact a performance regression), the proposals that Roshan
> > > Naik put together for revising the threading and execution model
> > > (STORM-2307) and replacing Disruptor with JCTools (STORM-2306) warrant
> > > review and consideration. See also STORM-2284 which is the parent JIRA.
> > >
> > > * Finish porting Storm UI to Java (STORM-1311)
> > >
> > > * Finish porting log viewer to Java (STORM-1280)
> > >
> > > The following are items that are nice to have in 2.0, but I don’t feel
> > > are absolutely necessary for an initial 2.0 release:
> > >
> > > * Beam Runner (I wouldn’t tie this to 2.0, mentioning it because it’s
> > > relevant) - Initially there seemed to be a lot of interest in this, but
> > > that seems to have trailed off. I spoke with some Beam developers and
> > > there seems to be interest from that community as well. Do we want to
> > > move that effort to the Beam community, or keep it here? Moving it to
> > > the Beam community might lead to better collaboration between projects.
> > >
> > > * Bounded Spouts (needed for Beam Runner implementation) - Currently
> > > spouts are unbounded; there is no end to the stream. Beam has the
> > > concept of bounded sources (roughly analogous to batch processing). To
> > > support that, we would need to implement a similar concept in Storm.
> > > One benefit of such a feature would be the ability to handle both
> > > bounded and unbounded workflows in Storm.
> > >
> > > * Storm-SQL - Jungtaek/Xin: You have been the primary drivers behind
> > > this effort. What improvements do you envision for 2.0?
> > >
> > > * Metrics V2 (STORM-2153: Coda Hale Metrics) — I’ve been targeting this
> > > for 1.2.0, but it’s designed to be easily portable to master/2.0.
> > >
> > > * JStorm Migration - The original outline can be found here [1]. Note
> > > that a lot of the associated JIRAs below are assigned, but there hasn’t
> > > been any recent activity or pull requests, so we should probably
> > > consider them unassigned and up for grabs:
> > >
> > > * Worker Classloader Isolation (STORM-1338) - Lack of this has been the
> > > bane of a lot of Storm users almost since day one. We have largely
> > > addressed it by shading/relocating dependencies. It would be great to
> > > see this addressed once and for all.
> > >
> > > * JStorm back pressure implementation (STORM-1324) - The current back
> > > pressure implementation leaves a bit to be desired, and the JStorm
> > > approach looks promising, though it also depends on the JStorm concept
> > > of “topology master” (STORM-1323), which may have some implications
> > > regarding security.
> > >
> > > * Dynamic Topology Updates (STORM-1335) - This would provide a command
> > > to update topology jars and configuration without stopping the
> > > topology, and is well suited to leverage the blobstore. The restart
> > > command (that can also update the topology configuration) also looks
> > > compelling (STORM-1334).
> > >
> > > * Additional Scheduler Implementations (STORM-1320)
> > >
> > > * Additional Grouping Implementations (STORM-1328)
> > >
> > >
> > > As always I’m open to any opinions and suggestions.
> > >
> > > -Taylor
> > >
> > > [1]
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109
> > >
> > >
> > >
> >
>
