I know people are seriously thinking about latency. So far that has not been the limiting factor for the users I've been working with.
On Wed, Oct 19, 2016 at 1:11 PM, Cody Koeninger <c...@koeninger.org> wrote:
> Is anyone seriously thinking about alternatives to microbatches?
>
> On Wed, Oct 19, 2016 at 2:45 PM, Michael Armbrust <mich...@databricks.com> wrote:
> > Anything that is actively being designed should be in JIRA, and it seems
> > like you found most of it. In general, release windows can be found on the
> > wiki.
> >
> > 2.1 has a lot of stability fixes as well as the kafka support you mentioned.
> > It may also include some of the following.
> >
> > The items I'd like to start thinking about next are:
> > - Evicting state from the store based on event time watermarks
> > - Sessionization (grouping together related events by key / eventTime)
> > - Improvements to the query planner (remove some of the restrictions on
> >   what queries can be run).
> >
> > This is roughly in order based on what I've been hearing users hit the most.
> > Would love more feedback on what is blocking real use cases.
> >
> > On Tue, Oct 18, 2016 at 1:51 AM, Ofir Manor <ofir.ma...@equalum.io> wrote:
> >>
> >> Hi,
> >> I hope it is the right forum.
> >> I am looking for some information of what to expect from
> >> StructuredStreaming in its next releases to help me choose when / where to
> >> start using it more seriously (or where to invest in workarounds and where
> >> to wait). I couldn't find a good place where such planning discussed for 2.1
> >> (like, for example ML and SPARK-15581).
> >> I'm aware of the 2.0 documented limits
> >> (http://spark.apache.org/docs/2.0.1/structured-streaming-programming-guide.html#unsupported-operations),
> >> like no support for multiple aggregations levels, joins are strictly to a
> >> static dataset (no SCD or stream-stream) etc, limited sources / sinks (like
> >> no sink for interactive queries) etc etc
> >> I'm also aware of some changes that have landed in master, like the new
> >> Kafka 0.10 source (and its on-going improvements) in SPARK-15406, the
> >> metrics in SPARK-17731, and some improvements for the file source.
> >> If I remember correctly, the discussion on Spark release cadence concluded
> >> with a preference to a four-month cycles, with likely code freeze pretty
> >> soon (end of October). So I believe the scope for 2.1 should likely quite
> >> clear to some, and that 2.2 planning should likely be starting about now.
> >> Any visibility / sharing will be highly appreciated!
> >> thanks in advance,
> >>
> >> Ofir Manor
> >>
> >> Co-Founder & CTO | Equalum
> >>
> >> Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io