[no subject]

2015-12-01 Thread Alexander Pivovarov

query on SVD++

2015-12-01 Thread 张志强(旺轩)
Hi All, I came across the SVD++ algorithm implementation in the Spark code base, but I was wondering why the Scala API hasn't been exposed to Python. Is there any plan to do this? BR, -Allen Zhang

Re: Problem in running MLlib SVM

2015-12-01 Thread Robert Dodier
Tarek, On looking at the code in SVM.scala, I see that SVMWithSGD.predictPoint first computes dot(w, x) + b where w is the SVM weight vector, x is the input vector, and b is a constant. If there is a threshold defined, then the output is 1 if that's greater than the threshold and 0 otherwise. If t
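
The decision rule described in this message can be sketched in plain Python. This is only an illustration, not the MLlib code: the function name `predict_point` and its argument names are made up here, and the branch that returns the raw margin when no threshold is set is an assumption about the behavior the thread refers to, since the message above is cut off before it gets there.

```python
from typing import Optional, Sequence


def predict_point(weights: Sequence[float],
                  features: Sequence[float],
                  intercept: float,
                  threshold: Optional[float] = 0.0) -> float:
    """Illustrative stand-in for the prediction step described above."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")

    # margin = dot(w, x) + b
    margin = sum(w * x for w, x in zip(weights, features)) + intercept

    # Assumption: with no threshold set, the raw margin is returned
    # instead of a 0/1 label.
    if threshold is None:
        return margin

    # With a threshold: 1 if the margin is greater than it, 0 otherwise.
    return 1.0 if margin > threshold else 0.0


if __name__ == "__main__":
    w, x, b = [1.0, -2.0], [3.0, 1.0], 0.5
    print(predict_point(w, x, b))                  # label, threshold 0.0
    print(predict_point(w, x, b, threshold=None))  # raw margin
```

If predictions come back as arbitrary real numbers rather than 0 or 1, that would be consistent with the no-threshold branch in this sketch.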

Re: Problem in running MLlib SVM

2015-12-01 Thread Joseph Bradley
Oh, sorry about that. I forgot that's the behavior when the threshold is not set. My guess would be that you need more iterations, or that the regParam needs to be tuned. I'd recommend testing on some of the LibSVM datasets. They have a lot, and you can find existing examples (and results) for

Re: Grid search with Random Forest

2015-12-01 Thread Ndjido Ardo BAR
Thanks for the clarification. Gonna test that and give you feedback. Ndjido On Tue, 1 Dec 2015 at 19:29, Joseph Bradley wrote: > You can do grid search if you set the evaluator to a > MulticlassClassificationEvaluator, which expects a prediction column, not a > rawPrediction column. There's a

Re: Grid search with Random Forest

2015-12-01 Thread Joseph Bradley
You can do grid search if you set the evaluator to a MulticlassClassificationEvaluator, which expects a prediction column, not a rawPrediction column. There's a JIRA for making BinaryClassificationEvaluator accept prediction instead of rawPrediction. Joseph On Tue, Dec 1, 2015 at 5:10 AM, Benjami

Re: Grid search with Random Forest

2015-12-01 Thread Benjamin Fradet
Someone correct me if I'm wrong but no there isn't one that I am aware of. Unless someone is willing to explain how to obtain the raw prediction column with the GBTClassifier. In this case I'd be happy to work on a PR. On 1 Dec 2015 8:43 a.m., "Ndjido Ardo BAR" wrote: > Hi Benjamin, > > Thanks,

Re: Bringing up JDBC Tests to trunk

2015-12-01 Thread Jacek Laskowski
On Mon, Nov 30, 2015 at 10:53 PM, Josh Rosen wrote: > In SBT, these wind up on the Docker JDBC tests' classpath as a transitive > dependency of the `spark-sql` test JAR. However, what we should be doing is > adding them as explicit test dependencies of the `docker-integration-tests` > subproject,

Re: How to add 1.5.2 support to ec2/spark_ec2.py ?

2015-12-01 Thread Alexander Pivovarov
Did 6 min ago On Tue, Dec 1, 2015 at 12:49 AM, Shivaram Venkataraman < shiva...@eecs.berkeley.edu> wrote: > Yeah - that needs to be changed as well. Could you send a PR to fix this ? > > Shivaram > > On Tue, Dec 1, 2015 at 12:32 AM, Alexander Pivovarov > wrote: > > Thank you, > > I looked at mas

Re: How to add 1.5.2 support to ec2/spark_ec2.py ?

2015-12-01 Thread Shivaram Venkataraman
Yeah - that needs to be changed as well. Could you send a PR to fix this ? Shivaram On Tue, Dec 1, 2015 at 12:32 AM, Alexander Pivovarov wrote: > Thank you, > I looked at master branch. I did not realize that it's behind branch-1.5 > BTW, line 54 still has SPARK_EC2_VERSION = "1.5.1" > > On Tue,

Re: How to add 1.5.2 support to ec2/spark_ec2.py ?

2015-12-01 Thread Alexander Pivovarov
Thank you, I looked at the master branch. I did not realize that it's behind branch-1.5. BTW, line 54 still has SPARK_EC2_VERSION = "1.5.1". On Tue, Dec 1, 2015 at 12:22 AM, Shivaram Venkataraman < shiva...@eecs.berkeley.edu> wrote: > Yeah we just need to add 1.5.2 as in > > https://github.com/apache/s

Re: How to add 1.5.2 support to ec2/spark_ec2.py ?

2015-12-01 Thread Shivaram Venkataraman
Yeah we just need to add 1.5.2 as in https://github.com/apache/spark/commit/97956669053646f00131073358e53b05d0c3d5d0#diff-ada66bbeb2f1327b508232ef6c3805a5 to the master branch as well Thanks Shivaram On Mon, Nov 30, 2015 at 11:38 PM, Alexander Pivovarov wrote: > just want to follow up > > On N
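
A minimal sketch of the kind of change discussed in this thread, assuming the script keeps a set of accepted version strings alongside a default. Only `SPARK_EC2_VERSION` is quoted in the thread; the set name `VALID_SPARK_VERSIONS`, its trimmed contents, and the helper `check_spark_version` are illustrative assumptions and are not copied from spark_ec2.py.

```python
# Default version string; the thread notes master still had "1.5.1" here.
SPARK_EC2_VERSION = "1.5.2"

# Assumed shape of the accepted-versions list (trimmed for illustration).
VALID_SPARK_VERSIONS = {
    "1.5.0",
    "1.5.1",
    "1.5.2",  # the entry the thread asks to add on master
}


def check_spark_version(version: str) -> str:
    """Hypothetical helper: reject version strings that are not listed."""
    if version not in VALID_SPARK_VERSIONS:
        raise ValueError("Unknown Spark version: " + version)
    return version


if __name__ == "__main__":
    print(check_spark_version(SPARK_EC2_VERSION))
```

Keeping the default and the accepted list next to each other makes it harder to bump one and forget the other, which is the situation the follow-up message points out.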