Re: StructuredStreaming status

Michael Armbrust Wed, 19 Oct 2016 13:31:16 -0700

I know people are seriously thinking about latency.  So far that has not
been the limiting factor for the users I've been working with.


On Wed, Oct 19, 2016 at 1:11 PM, Cody Koeninger <c...@koeninger.org> wrote:

> Is anyone seriously thinking about alternatives to microbatches?
>
> On Wed, Oct 19, 2016 at 2:45 PM, Michael Armbrust
> <mich...@databricks.com> wrote:
> > Anything that is actively being designed should be in JIRA, and it seems
> > like you found most of it.  In general, release windows can be found on
> > the wiki.
> >
> > 2.1 has a lot of stability fixes as well as the Kafka support you
> > mentioned. It may also include some of the following.
> >
> > The items I'd like to start thinking about next are:
> >  - Evicting state from the store based on event time watermarks
> >  - Sessionization (grouping together related events by key / eventTime)
> >  - Improvements to the query planner (remove some of the restrictions on
> > what queries can be run).
> >
> > This is roughly in order based on what I've been hearing users hit the
> > most. Would love more feedback on what is blocking real use cases.
> >
> > On Tue, Oct 18, 2016 at 1:51 AM, Ofir Manor <ofir.ma...@equalum.io> wrote:
> >>
> >> Hi,
> >> I hope this is the right forum.
> >> I am looking for some information on what to expect from
> >> StructuredStreaming in its next releases, to help me choose when / where
> >> to start using it more seriously (or where to invest in workarounds and
> >> where to wait). I couldn't find a good place where such planning is
> >> discussed for 2.1 (like, for example, ML and SPARK-15581).
> >> I'm aware of the 2.0 documented limits
> >> (http://spark.apache.org/docs/2.0.1/structured-streaming-programming-guide.html#unsupported-operations),
> >> like no support for multiple aggregation levels, joins strictly to a
> >> static dataset (no SCD or stream-stream), limited sources / sinks (like
> >> no sink for interactive queries), etc.
> >> I'm also aware of some changes that have landed in master, like the new
> >> Kafka 0.10 source (and its on-going improvements) in SPARK-15406, the
> >> metrics in SPARK-17731, and some improvements for the file source.
> >> If I remember correctly, the discussion on Spark release cadence
> >> concluded with a preference for a four-month cycle, with a likely code
> >> freeze pretty soon (end of October). So I believe the scope for 2.1
> >> should likely be quite clear to some, and 2.2 planning should likely be
> >> starting about now.
> >> Any visibility / sharing will be highly appreciated!
> >> thanks in advance,
> >>
> >> Ofir Manor
> >>
> >> Co-Founder & CTO | Equalum
> >>
> >> Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io
> >
> >
>
