Re: Converting SparkSQL query to Scala query

2015-03-23 Thread Dean Wampler
SQL keyword. HTH, Dean Dean Wampler, Ph.D. Author: Programming Scala, 2nd Edition http://shop.oreilly.com/product/0636920033073.do (O'Reilly) Typesafe http://typesafe.com @deanwampler http://twitter.com/deanwampler http://polyglotprogramming.com On Mon, Mar 23, 2015 at 11:42 AM, nishitd nishitde

Re: Error while installing Spark 1.3.0 on local machine

2015-03-22 Thread Dean Wampler
Any particular reason you're not just downloading a build from the downloads page (http://spark.apache.org/downloads.html)? Even if you aren't using Hadoop, any of those builds will work. If you want to build from source, the Maven build is more reliable. dean

Re: can distinct transform applied on DStream?

2015-03-22 Thread Dean Wampler
aDstream.transform(_.distinct()) will only make the elements of each RDD in the DStream distinct; it does not make them distinct across the whole DStream. Is that what you're seeing?
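To illustrate the per-RDD behavior described in the reply above, here is a minimal sketch that uses plain Scala collections in place of Spark. Each inner Seq stands in for the RDD of one batch, and the object and value names are hypothetical. It is not Spark code, only a model of the semantics.

```scala
object DistinctPerBatch {
  // Assumption: each inner Seq models the RDD produced for one batch interval.
  val batches: Seq[Seq[Int]] = Seq(Seq(1, 1, 2), Seq(2, 3, 3), Seq(1, 4))

  // Analogue of aDstream.transform(_.distinct()):
  // duplicates are removed inside each batch, but 1 and 2 still recur across batches.
  val perBatch: Seq[Seq[Int]] = batches.map(_.distinct)

  // A globally distinct result needs state carried from batch to batch.
  // Here a running Set of already-seen elements filters every new batch.
  val globallyDistinct: Seq[Seq[Int]] =
    batches
      .scanLeft((Set.empty[Int], Seq.empty[Int])) { case ((seen, _), batch) =>
        val fresh = batch.distinct.filterNot(seen)
        (seen ++ fresh, fresh)
      }
      .tail          // drop the initial empty state
      .map(_._2)     // keep only the newly seen elements of each batch

  def main(args: Array[String]): Unit = {
    println(perBatch)
    println(globallyDistinct)
  }
}
```

In Spark Streaming itself, the second variant would need some stateful operation across batches; which one fits is not settled by this thread.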

Re: How Does aggregate work

2015-03-22 Thread Dean Wampler
+ ... (2 + (2 + (2 + 0 + p_1) + p_2) + p_3) ...)
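The nested expression above has the shape of a left fold inside each partition, with the partial results then added together. The sketch below reproduces that shape with plain Scala collections. The seqOp and combOp shown are assumptions read off the visible fragment (the original message is truncated), and the names are hypothetical.

```scala
object AggregateShape {
  // Assumed from the fragment: start from 0 and add 2 plus the element at every step.
  val zero: Int = 0
  val seqOp: (Int, Int) => Int = (acc, p) => 2 + acc + p
  // Assumed: partition results are combined with plain addition.
  val combOp: (Int, Int) => Int = _ + _

  // Models aggregate(zero)(seqOp, combOp): fold every partition from zero with seqOp,
  // then merge the per-partition results with combOp.
  def aggregate(partitions: Seq[Seq[Int]]): Int =
    partitions.map(_.foldLeft(zero)(seqOp)).foldLeft(zero)(combOp)

  def main(args: Array[String]): Unit = {
    // Partition 1: 2 + (2 + (2 + 0 + 1) + 2) + 3 = 12; partition 2: 2 + (2 + 0 + 4) + 5 = 13
    println(aggregate(Seq(Seq(1, 2, 3), Seq(4, 5))))
  }
}
```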

Re: [SQL] Self join with ArrayType columns problems

2015-01-26 Thread Dean Wampler
You are creating a HiveContext, then using the sql method instead of hql. Is that deliberate? The code doesn't work if you replace HiveContext with SQLContext. Lots of exceptions are thrown, but I don't have time to investigate now. dean

Re: Spark Project Fails to run multicore in local mode.

2015-01-08 Thread Dean Wampler
archive http://apache-spark-user-list.1001560.n3.nabble.com/ at Nabble.com.

Re: scala Vector vs mllib Vector

2014-10-04 Thread Dean Wampler
(the unchanged parts) to make efficient copies. Also, Scala Vector isn't designed to represent sparse vectors. dean
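As a small illustration of the two points in the reply above, the sketch below shows that updating a Scala Vector yields a new Vector while the original stays intact, and contrasts that dense layout with a sparse one. SparseVec is a hypothetical type written for this example; it is not MLlib's API.

```scala
object VectorCopy {
  val original: Vector[Double] = Vector(0.0, 0.0, 3.5, 0.0)

  // updated returns a new Vector; the original is not modified.
  val changed: Vector[Double] = original.updated(0, 9.0)

  // Hypothetical sparse representation: only non-zero entries are stored,
  // together with the logical size. Scala's Vector stores every element instead.
  final case class SparseVec(size: Int, nonZero: Map[Int, Double]) {
    def apply(i: Int): Double = {
      require(i >= 0 && i < size, s"index $i out of bounds for size $size")
      nonZero.getOrElse(i, 0.0)
    }
  }

  // Builds the sparse form from a dense Vector by dropping the zeros.
  def toSparse(v: Vector[Double]): SparseVec =
    SparseVec(v.length, v.zipWithIndex.collect { case (x, i) if x != 0.0 => i -> x }.toMap)

  def main(args: Array[String]): Unit = {
    println(original)
    println(changed)
    println(toSparse(original))
  }
}
```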

Re: SparkSQL Thriftserver in Mesos

2014-09-22 Thread Dean Wampler
https://spark.apache.org/docs/latest/running-on-mesos.html

Re: Dependency Problem with Spark / ScalaTest / SBT

2014-09-14 Thread Dean Wampler
Can you post your whole SBT build file(s)?

Re: Dependency Problem with Spark / ScalaTest / SBT

2014-09-14 Thread Dean Wampler
Sorry, I meant any *other* SBT files. However, what happens if you remove the line exclude("org.eclipse.jetty.orbit", "javax.servlet")? dean

Re: Issue with Spark on EC2 using spark-ec2 script

2014-08-01 Thread Dean Wampler
It looked like you were running in local mode (master set to local[4]). That's how I ran it.

Re: Recommended pipeline automation tool? Oozie?

2014-07-15 Thread Dean Wampler
http://apache-spark-user-list.1001560.n3.nabble.com/Recommended-pipeline-automation-tool-Oozie-tp9319.html

Re: Spark vs Google cloud dataflow

2014-06-27 Thread Dean Wampler
in the Hadoop ecosystem. I think Dataflows is more than that but yeah that seems to be some of the 'language'. It is similar in that it is a distributed collection abstraction.

Re: Announcing Spark 1.0.0

2014-05-30 Thread Dean Wampler
scratch the surface - check out the release notes here: http://spark.apache.org/releases/spark-release-1-0-0.html Note that since release artifacts were posted recently, certain mirrors may not have working downloads for a few hours. - Patrick

Re: My talk on Spark: The Next Top (Compute) Model

2014-05-01 Thread Dean Wampler
/GCS). Why configure Hadoop if you don't have to? On Thu, May 1, 2014 at 12:25 AM, Dean Wampler deanwamp...@gmail.com wrote: I meant to post this last week, but this is a talk I gave at the Philly ETE conf. last week: http://www.slideshare.net/deanwampler/spark-the-next-top-compute-model

Re: My talk on Spark: The Next Top (Compute) Model

2014-05-01 Thread Dean Wampler
have to? On Thu, May 1, 2014 at 12:25 AM, Dean Wampler deanwamp...@gmail.com wrote: I meant to post this last week, but this is a talk I gave at the Philly ETE conf. last week: http://www.slideshare.net/deanwampler/spark-the-next-top-compute-model Also here: http

Re: Spark Training

2014-05-01 Thread Dean Wampler
in context: Spark Training http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Training-tp5166.html

My talk on Spark: The Next Top (Compute) Model

2014-04-30 Thread Dean Wampler
I meant to post this last week, but this is a talk I gave at the Philly ETE conf. last week: http://www.slideshare.net/deanwampler/spark-the-next-top-compute-model Also here: http://polyglotprogramming.com/papers/Spark-TheNextTopComputeModel.pdf dean

Re: K-means with large K

2014-04-28 Thread Dean Wampler
of this? Thanks, Dave

Re: Using Spark for Divide-and-Conquer Algorithms

2014-04-11 Thread Dean Wampler
and Distributed Systems Shanghai Jiao Tong University Email: yanzhe...@gmail.com

Re: Hybrid GPU CPU computation

2014-04-11 Thread Dean Wampler
, 2014 at 2:38 PM, Jaonary Rabarisoa jaon...@gmail.com wrote: Hi all, I'm just wondering whether hybrid GPU/CPU computation is feasible with Spark, and what the best way to do it would be. Cheers, Jaonary

Re: Spark - ready for prime time?

2014-04-10 Thread Dean Wampler
intrinsic reasons for this to be impossible? Sorry again for the giant mail, and thanks for any insights! Andras

Re: Spark - ready for prime time?

2014-04-10 Thread Dean Wampler
comment? *From:* Dean Wampler deanwamp...@gmail.com *Sent:* Thursday, April 10, 2014 7:39 AM *To:* Spark Users user@spark.apache.org *Cc:* Daniel Darabos daniel.dara...@lynxanalytics.com, Andras Barjak andras.bar...@lynxanalytics.com Spark has been endorsed by Cloudera
