Re: [VOTE] Apache Spark 2.1.0 (RC1)

2016-12-01 Thread Robert Kruszewski
-1 since https://issues.apache.org/jira/browse/SPARK-17213 is a correctness regression from the 2.0 release. The commit that caused it is 776d183c82b424ef7c3cae30537d8afe9b9eee83. Robert From: Reynold Xin Date: Tuesday, November 29, 2016 at 1:25 AM To:

Re: [SPARK-17845] [SQL][PYTHON] More self-evident window function frame boundary API

2016-12-01 Thread Maciej Szymkiewicz
It could be something like this https://github.com/zero323/spark/commit/b1f4d8218629b56b0982ee58f5b93a40305985e0 but I am not fully satisfied. On 11/30/2016 07:34 PM, Reynold Xin wrote: > Yes I'd define unboundedPreceding to -sys.maxsize, but also any value > less than min(-sys.maxsize,
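
For readers following the thread, the idea quoted above can be sketched in plain Python. This is only an illustration and not the code from the linked commit: the names `normalize_frame`, `JAVA_MIN_LONG` and `JAVA_MAX_LONG` are hypothetical, and because the quoted message is cut off, the exact threshold used here (anything at or beyond `-sys.maxsize` / `sys.maxsize` counts as unbounded) is an assumption.

```python
import sys

# Hypothetical constants for illustration; not taken from the linked commit.
JAVA_MIN_LONG = -(1 << 63)      # smallest value a JVM long can hold
JAVA_MAX_LONG = (1 << 63) - 1   # largest value a JVM long can hold

UNBOUNDED_PRECEDING = -sys.maxsize
UNBOUNDED_FOLLOWING = sys.maxsize
CURRENT_ROW = 0


def normalize_frame(start, end):
    """Map Python-side frame boundaries to values that fit in a JVM long.

    Assumption: any start at or below -sys.maxsize means "unbounded
    preceding", and any end at or above sys.maxsize means "unbounded
    following". Other offsets are passed through unchanged.
    """
    if start <= UNBOUNDED_PRECEDING:
        start = JAVA_MIN_LONG
    if end >= UNBOUNDED_FOLLOWING:
        end = JAVA_MAX_LONG
    return start, end


if __name__ == "__main__":
    # A frame from the start of the partition up to the current row.
    print(normalize_frame(UNBOUNDED_PRECEDING, CURRENT_ROW))
    # A bounded frame of three rows on either side is left as-is.
    print(normalize_frame(-3, 3))
```

Clamping on the Python side like this would keep arbitrarily large Python integers from overflowing when they are handed to the JVM, which seems to be the concern in the quoted reply.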

unsubscribe

2016-12-01 Thread Vishal Soni

Re: [SPARK-17845] [SQL][PYTHON] More self-evident window function frame boundary API

2016-12-01 Thread Reynold Xin
Can you submit a pull request with test cases based on that change? On Dec 1, 2016, 9:39 AM -0800, Maciej Szymkiewicz wrote: > This doesn't affect that. The only concern is what we consider to be UNBOUNDED > on the Python side. > > On 12/01/2016 07:56 AM, assaf.mendelson

Re: [SPARK-17845] [SQL][PYTHON] More self-evident window function frame boundary API

2016-12-01 Thread Maciej Szymkiewicz
This doesn't affect that. The only concern is what we consider to be UNBOUNDED on the Python side. On 12/01/2016 07:56 AM, assaf.mendelson wrote: > > I may be mistaken, but if I remember correctly Spark behaves > differently when it is bounded in the past and when it is not. > Specifically I seem to

Re: REST api for monitoring Spark Streaming

2016-12-01 Thread Chan Chor Pang
Hi everyone, I have done the coding and created the PR. The implementation is straightforward and similar to the API in spark-core, but we still need someone with a streaming background to verify the patch, just to make sure everything is OK. So, can anyone please help?

Hidden Markov Model or Bayes Networks in Spark - MS Thesis theme

2016-12-01 Thread Alex153
As part of my MS Thesis (in computer science) project I am looking for a chance to implement some machine learning or data mining algorithms. Are there good ideas for this? Are there any unimplemented algorithms that would be a great contribution to the project? I am thinking about Hidden Markov Models