Number of goals to win championship

2015-02-05 Thread jvuillermet
I want to find the minimum number of goals a player must score to make it likely that their team wins the championship. My data (goals, win/lose): 25 1, 5 0, 10 1, 20 0. After some reading and courses, I think I need a logistic regression model for this data. I create my LabeledPoint instances from that data (1/
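A minimal sketch of the idea in the post above, in plain Python rather than Spark MLlib: it fits a one-feature logistic regression by batch gradient descent on the four (goals, win/lose) pairs from the post, then solves for the goal count at which the fitted win probability crosses 0.5. The function names, learning rate, step count, and the 0.5 cut-off are assumptions made for illustration, not something the thread specifies, and four points are far too few for a meaningful answer.

```python
import math

# (goals, won) pairs copied from the post; 1 = win, 0 = loss.
DATA = [(25, 1), (5, 0), (10, 1), (20, 0)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(data, lr=0.1, steps=5000):
    """Fit p(win) = sigmoid(w * goals + b) by batch gradient descent.

    Goals are divided by their maximum during training so the hypothetical
    learning rate behaves; the returned weight is converted back to
    "per goal" units.
    """
    scale = max(goals for goals, _ in data)
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for goals, won in data:
            x = goals / scale
            err = sigmoid(w * x + b) - won  # gradient of the log-loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w / scale, b


def win_probability(goals, w, b):
    return sigmoid(w * goals + b)


def goals_threshold(w, b):
    """Goal count where the fitted probability equals 0.5, i.e. w * goals + b = 0."""
    return -b / w


w, b = fit_logistic(DATA)
print("weight per goal:", w)
print("intercept:", b)
print("goals at p(win) = 0.5:", goals_threshold(w, b))
```

The threshold is only defined when the fitted weight is non-zero, and "minimum number of goals" here simply means the point where the model starts to favour a win. A stricter cut-off (say 0.8) would need the logit of that probability in place of 0.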

Re: SQL JSON array operations

2015-01-15 Thread jvuillermet
Yeah, that's where I ended up. Thanks! I'll give it a try. On Thu, Jan 15, 2015 at 8:46 PM, Ayoub [via Apache Spark User List] <ml-node+s1001560n21172...@n3.nabble.com> wrote: > You could try to use the Hive context, which brings HiveQL; it would allow you > to query nested structures using "LATERAL V

SQL JSON array operations

2015-01-15 Thread jvuillermet
Let's say the lines of my JSON file look like this: {"user": "baz", "tags" : ["foo", "bar"] } sqlContext.jsonFile("data.json") ... How could I query for users with the "bar" tag using SQL? sqlContext.sql("select user from users where tags ?contains? 'bar' ") I could simplify the request and use the re
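To make the intended "contains" semantics concrete, here is a small plain-Python sketch (no Spark involved) that explodes each record's tags array into one (user, tag) row per element and then filters on the tag. The second sample record and the helper names are made up for illustration. The HiveQL in the comment is one form such a query might take under a Hive context; it is an untested assumption here, not the thread's confirmed solution.

```python
import json

# First line has the shape given in the post; the second is a made-up
# record added only so the filter has something to exclude.
LINES = [
    '{"user": "baz", "tags": ["foo", "bar"]}',
    '{"user": "qux", "tags": ["foo"]}',
]

# Assumed HiveQL counterpart (not executed in this sketch):
#   SELECT user FROM users LATERAL VIEW explode(tags) t AS tag WHERE tag = 'bar'


def explode_tags(lines):
    """Yield one (user, tag) row per array element of each JSON record."""
    for line in lines:
        record = json.loads(line)
        for tag in record.get("tags", []):
            yield record["user"], tag


def users_with_tag(lines, wanted):
    """Return users that carry `wanted`, in first-seen order, without duplicates."""
    seen = []
    for user, tag in explode_tags(lines):
        if tag == wanted and user not in seen:
            seen.append(user)
    return seen


print(users_with_tag(LINES, "bar"))  # prints ['baz']
```

Exploding first and filtering second mirrors how a row-per-element view behaves. A direct membership test on the array would give the same users for this data.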