Re: n

2016-04-27 Thread shengshanzhang
Your build.sbt seems a little complex. Thank you a lot. The example on the official Spark website explains how to use spark-sql from spark-shell, but there are no instructions on how to write a self-contained application, which is hard for a learner who is not familiar with Scala or Ja

Re: n

2016-04-27 Thread shengshanzhang
Thanks a lot. I added a spark-sql dependency in build.sbt, as the last line below shows:

name := "Simple Project"
version := "1.0"
scalaVersion := "2.10.5"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.1"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.6.1"

> On 27 April 2016,
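For reference, a minimal self-contained application that goes with these dependencies might look like the sketch below (Spark 1.6-era API; the object name and the JSON file path are illustrative, not from the original message):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    object SimpleApp {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("Simple Project")
        val sc = new SparkContext(conf)
        val sqlContext = new SQLContext(sc)

        // load a JSON file into a DataFrame and query it with SQL
        val df = sqlContext.read.json("people.json")
        df.registerTempTable("people")
        sqlContext.sql("SELECT name FROM people").show()

        sc.stop()
      }
    }

Packaged with sbt and launched through spark-submit, this covers the self-contained case that the official docs illustrate only through spark-shell.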

Re: n

2016-04-27 Thread ramesh reddy
The Spark SQL jar has to be added as a dependency in build.sbt. On Wednesday, 27 April 2016 1:57 PM, shengshanzhang wrote: Hello: my code is as follows: --- import org.apache.spark.{SparkConf, SparkContext} import

Re: n

2016-04-27 Thread Marco Mistroni
Hi, please share your build.sbt. Here's mine for reference (using Spark 1.6.1 + Scala 2.10); please ignore the extra stuff I have added for assembly and logging.

// Set the project name to the string 'My Project'
name := "SparkExamples"
// The := method used in Name and Version is one of two fundamental
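For readers who want a complete file, a stripped-down build.sbt in the shape Marco describes (assembly and logging settings omitted; project name and versions follow this thread) could look like:

    name := "SparkExamples"
    version := "1.0"
    scalaVersion := "2.10.5"
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.6.1",
      "org.apache.spark" %% "spark-sql"  % "1.6.1"
    )

When a fat jar is built with sbt-assembly and run via spark-submit, the Spark dependencies are commonly marked "provided" so they are not bundled into the assembly.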

Re: N kafka topics vs N spark Streaming

2015-06-19 Thread Akhil Das
Like this?

val add_msgs = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, Array("add").toSet)
val delete_msgs = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, Array("delete").toSet)
val upd
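For a larger or variable set of topics, the same pattern can be generalized to one direct stream per topic; a sketch, assuming ssc and kafkaParams are already defined as in the snippet above and the topic names are placeholders:

    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.KafkaUtils

    val topics = Seq("add", "delete", "update")
    val streamsByTopic = topics.map { t =>
      // one direct stream per topic, keyed by the topic name
      t -> KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
        ssc, kafkaParams, Set(t))
    }.toMap

Alternatively, a single stream over all topics can be created by passing every topic name in one Set, if per-topic separation is not needed downstream.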

Re: N-Fold validation and RDD partitions

2014-03-25 Thread Jaonary Rabarisoa
There is also a "randomSplit" method in the latest version of spark https://github.com/apache/incubator-spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD.scala On Tue, Mar 25, 2014 at 1:21 AM, Holden Karau wrote: > There is also https://github.com/apache/spark/pull/18 against the c

Re: N-Fold validation and RDD partitions

2014-03-24 Thread Holden Karau
There is also https://github.com/apache/spark/pull/18 against the current repo, which may be easier to apply. On Fri, Mar 21, 2014 at 8:58 AM, Hai-Anh Trinh wrote: > Hi Jaonary, > > You can find the code for k-fold CV in > https://github.com/apache/incubator-spark/pull/448. I have not found the >

Re: N-Fold validation and RDD partitions

2014-03-24 Thread Walrus theCat
If someone wanted or needed to implement this themselves, are partitions the correct way to go? Any tips on how to get started (say, dividing an RDD into 5 parts)? On Fri, Mar 21, 2014 at 9:51 AM, Jaonary Rabarisoa wrote: > Thank you Hai-Anh. Are the files CrossValidation.scala and > RandomS
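One way to get started without touching partitions, sketched under the assumption that randomSplit (mentioned above) is available in the Spark version in use: split the RDD into 5 parts, then for each fold take one part as the test set and the union of the others as the training set.

    val k = 5
    val parts = data.randomSplit(Array.fill(k)(1.0), seed = 11L)
    val folds = (0 until k).map { i =>
      val test  = parts(i)
      val train = (parts.take(i) ++ parts.drop(i + 1)).reduce(_ union _)
      (train, test)
    }

Partitions control physical data placement rather than logical grouping, so they are usually not the right tool for defining folds.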

Re: N-Fold validation and RDD partitions

2014-03-21 Thread Jaonary Rabarisoa
Thank you Hai-Anh. Are the files CrossValidation.scala and RandomSplitRDD.scala enough to use it? I'm currently using Spark 0.9.0 and I want to avoid rebuilding everything. On Fri, Mar 21, 2014 at 4:58 PM, Hai-Anh Trinh wrote: > Hi Jaonary, > > You can find the code for k-fold CV in > https:/

Re: N-Fold validation and RDD partitions

2014-03-21 Thread Hai-Anh Trinh
Hi Jaonary, You can find the code for k-fold CV in https://github.com/apache/incubator-spark/pull/448. I have not found the time to resubmit the pull request against the latest master. On Fri, Mar 21, 2014 at 8:46 PM, Sanjay Awatramani wrote: > Hi Jaonary, > > I believe the n folds should be mapped into n Keys i

Re: N-Fold validation and RDD partitions

2014-03-21 Thread Sanjay Awatramani
Hi Jaonary, I believe the n folds should be mapped to n keys in Spark using a map function. You can then reduce the returned PairRDD and you should get your metric. I don't understand partitions fully, but from what I understand of them, they aren't required in your scenario. Regards, Sanjay
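A rough sketch of the keying idea described here, where data stands for the input RDD and the per-fold "metric" is just a count for illustration:

    import scala.util.Random
    import org.apache.spark.SparkContext._  // pair-RDD functions in pre-1.3 Spark

    val n = 5
    // tag every record with a fold key, giving a pair RDD keyed by fold index
    val keyed = data.map(x => (Random.nextInt(n), x))
    // reduce per key; a real evaluation metric would replace the simple count
    val perFoldCount = keyed.mapValues(_ => 1L).reduceByKey(_ + _)

Note that assigning folds with Random inside a transformation is not deterministic if partitions are recomputed; hashing a stable record id would be a more robust choice.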