Hi Bryan,

Excellent questions about the upcoming 2.0! It took me a while to find the answer about Structured Streaming.
Have you seen http://people.apache.org/~pwendell/spark-nightly/spark-master-docs/latest/structured-streaming-programming-guide.html#window-operations-on-event-time ? That may be relevant to your question 2.

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

On Mon, Jul 25, 2016 at 8:23 PM, Bryan Jeffrey <bryan.jeff...@gmail.com> wrote:
> All,
>
> I had three questions:
>
> (1) Is there a timeline for a stable Spark 2.0 release? I know the 'preview'
> build is out there, but I was curious what the timeline is for the full
> release. Jira seems to indicate that there should be a release on 7/27.
>
> (2) There has been a lot of discussion about 'continuous' datasets. One item
> that came up in tickets was the idea that 'count()' and other functions do
> not apply to continuous datasets:
> https://github.com/apache/spark/pull/12080. In this case, what is the
> intended procedure for calculating a streaming statistic based on an
> interval (e.g. counting the number of records in a 2-minute window every
> 2 minutes)?
>
> (3) In previous releases (1.6.1), a call to DStream / RDD repartition with
> the number of partitions set to zero silently deletes data. I have looked
> in Jira for a similar issue, but I do not see one. I would like to address
> this (and would likely be willing to go fix it myself). Should I just
> create a ticket?
>
> Thank you,
>
> Bryan Jeffrey

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org