Re: SparkR : glm model

2016-06-11 Thread Sun Rui
You were looking at some old code. The Poisson family is supported in the latest master branch. You can try the Spark 2.0 preview release from http://spark.apache.org/news/spark-2.0.0-preview.html > On Jun 10, 2016, at 12:14, april_ZMQ

Re: Slow collecting of large Spark Data Frames into R

2016-06-11 Thread Sun Rui
Hi Jonathan, Thanks for reporting. This is a known issue that the community would like to address later. Please refer to https://issues.apache.org/jira/browse/SPARK-14037. It would be helpful if you could profile your use case using the method discussed in the JIRA issue and paste the

Re: Running Spark in Standalone or local modes

2016-06-11 Thread Mich Talebzadeh
Hi Ashok Your points: " I know I can start spark-shell by launching the shell itself spark-shell Now I know that in standalone mode I can also connect to master spark-shell --master spark://:7077 My point is what are the differences between these two start-up modes for spark-shell? If I

Re: Running Spark in Standalone or local modes

2016-06-11 Thread Gavin Yue
Sorry, I had a typo. I meant that Spark does not use YARN or Mesos in standalone mode... > On Jun 11, 2016, at 14:35, Mich Talebzadeh wrote: > > Hi Gavin, > > I believe in standalone mode a simple cluster manager is included with Spark > that makes it easy to set

Re: Running Spark in Standalone or local modes

2016-06-11 Thread Mich Talebzadeh
Hi Gavin, I believe in standalone mode a simple cluster manager is included with Spark that makes it easy to set up a cluster. It does not rely on YARN or Mesos. In summary this is from my notes: - Spark Local - Spark runs on the local host. This is the simplest set up and best

Re: Running Spark in Standalone or local modes

2016-06-11 Thread Gavin Yue
Standalone mode is the alternative to YARN mode or Mesos mode, in which Spark uses YARN or Mesos as the cluster manager. Local mode is actually a standalone mode in which everything runs on a single local machine instead of on remote clusters. That is my understanding. On Sat, Jun 11, 2016 at 12:40

Accuracy of BinaryClassificationMetrics

2016-06-11 Thread Marco Mistroni
Hi all, which method should I use to verify the accuracy of a BinaryClassificationMetrics? MulticlassMetrics has a precision() method, but that is missing on BinaryClassificationMetrics. Thanks, Marco
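
A minimal sketch, not from the thread: BinaryClassificationMetrics reports threshold-dependent precision and recall and the areas under the ROC and PR curves rather than a single accuracy figure, so one option is to compute accuracy directly from the (score, label) pairs at a chosen threshold. The function name accuracy_at_threshold and the sample data are hypothetical, and plain Python stands in for the Spark API here.

```python
def accuracy_at_threshold(scores_and_labels, threshold=0.5):
    """Fraction of (score, label) pairs classified correctly at `threshold`.

    A score at or above the threshold is predicted as class 1.0,
    anything below it as class 0.0. Labels are expected to be 0.0 or 1.0.
    """
    correct = 0
    total = 0
    for score, label in scores_and_labels:
        predicted = 1.0 if score >= threshold else 0.0
        if predicted == label:
            correct += 1
        total += 1
    if total == 0:
        raise ValueError("cannot compute accuracy of an empty collection")
    return correct / total


# Hypothetical sample: (predicted score, true label) pairs.
sample = [(0.9, 1.0), (0.8, 1.0), (0.3, 0.0), (0.6, 0.0)]
print(accuracy_at_threshold(sample, threshold=0.5))
print(accuracy_at_threshold(sample, threshold=0.7))
```

Accuracy for a binary classifier depends on the threshold, which is presumably why it is not exposed as a single number; sweeping the threshold as above shows how the figure moves.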

Re: Running Spark in Standalone or local modes

2016-06-11 Thread Ashok Kumar
Thank you, I am grateful. I know I can start spark-shell by launching the shell itself (spark-shell). Now I know that in standalone mode I can also connect to the master (spark-shell --master spark://:7077). My point is: what are the differences between these two start-up modes for spark-shell? If I start

Re: Book for Machine Learning (MLIB and other libraries on Spark)

2016-06-11 Thread Mich Talebzadeh
Yes, absolutely, Ted. Thanks for highlighting it. Dr Mich Talebzadeh LinkedIn https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw http://talebzadehmich.wordpress.com On 11

Re: Running Spark in Standalone or local modes

2016-06-11 Thread Mohammad Tariq
Hi Ashok, In local mode all the processes run inside a single JVM, whereas in standalone mode we have separate master and worker processes running in their own JVMs. To quickly test your code from within your IDE you could probably use the local mode. However, to get a real feel of how Spark
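
A minimal sketch, not from the thread, of how the value passed to --master maps onto the modes discussed here. The function name describe_master and the host name master-host are hypothetical; the URL forms themselves (local, local[N], local[*], spark://host:port, mesos://host:port, yarn) are the ones Spark accepts for --master.

```python
import re


def describe_master(master: str) -> str:
    """Describe the deployment mode implied by a --master value."""
    if master == "local":
        return "local mode: one JVM, one worker thread"
    match = re.fullmatch(r"local\[(\*|\d+)\]", master)
    if match:
        count = match.group(1)
        threads = "one thread per core" if count == "*" else f"{count} worker threads"
        return f"local mode: one JVM, {threads}"
    if master.startswith("spark://"):
        return "standalone mode: separate master and worker JVMs"
    if master.startswith("mesos://"):
        return "Mesos cluster manager"
    if master == "yarn" or master.startswith("yarn-"):
        return "YARN cluster manager"
    return "unrecognized master URL"


# "master-host" is a placeholder, not a host named in the thread.
for url in ["local", "local[4]", "local[*]", "spark://master-host:7077", "yarn"]:
    print(url, "->", describe_master(url))
```

Running spark-shell with no --master argument typically falls back to a local master unless one is configured elsewhere, which is the difference between the two start-up commands asked about in this thread.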

Running Spark in Standalone or local modes

2016-06-11 Thread Ashok Kumar
Hi, What is the difference between running Spark in local mode and in standalone mode? Are they the same? If not, which is best suited for non-production work? I am also aware that one can run Spark in YARN mode as well. Thanks

Re: Book for Machine Learning (MLIB and other libraries on Spark)

2016-06-11 Thread Ted Yu
Another source is the presentations at various conferences, e.g. http://www.slideshare.net/databricks/apache-spark-mllib-20-preview-data-science-and-production FYI On Sat, Jun 11, 2016 at 8:47 AM, Mich Talebzadeh wrote: > Interesting. > > The pace of development in

Re: Book for Machine Learning (MLIB and other libraries on Spark)

2016-06-11 Thread Mich Talebzadeh
Interesting. The pace of development in this field is such that practically every single book in the Big Data landscape gets out of date before the ink dries on it :) I concur that they serve as a good reference for starters, but in my opinion the best way to learn is to start from on-line docs (and

Re: Book for Machine Learning (MLIB and other libraries on Spark)

2016-06-11 Thread Ted Yu
https://www.amazon.com/Machine-Learning-Spark-Powerful-Algorithms/dp/1783288515/ref=sr_1_1?ie=UTF8=1465657706=8-1=spark+mllib https://www.amazon.com/Spark-Practical-Machine-Learning-Chinese/dp/7302420424/ref=sr_1_3?ie=UTF8=1465657706=8-3=spark+mllib

Book for Machine Learning (MLIB and other libraries on Spark)

2016-06-11 Thread Deepak Goel
Hey, Namaskara~Nalama~Guten Tag~Bonjour. I am a newbie to Machine Learning (MLlib and other libraries on Spark). Which would be the best book to learn from? Thanks, Deepak -- Keigu Deepak 73500 12833 www.simtree.net, dee...@simtree.net deic...@gmail.com LinkedIn: www.linkedin.com/in/deicool

Re: SAS_TO_SPARK_SQL_(Could be a Bug?)

2016-06-11 Thread Ajay Chander
I tried implementing the same functionality through Scala as well, but no luck so far. Just wondering if anyone here has tried using Spark SQL to read a SAS dataset? Thank you. Regards, Ajay On Friday, June 10, 2016, Ajay Chander wrote: > Mich, I completely agree with you. I

Big Data Interview

2016-06-11 Thread Chaturvedi Chola
Good book on interview preparation for big data https://notionpress.com/read/big-data-interview-faqs