Re: A basic question

2019-06-17 Thread Shyam P
Thank you so much Deepak. Let me implement and update you. Hope it works. Are there any shortcomings I need to consider or take care of? Regards, Shyam On Mon, Jun 17, 2019 at 12:39 PM Deepak Sharma wrote: > You can follow this example:

Re: A basic question

2019-06-17 Thread Deepak Sharma
You can follow this example: https://docs.spring.io/spring-hadoop/docs/current/reference/html/springandhadoop-spark.html On Mon, Jun 17, 2019 at 12:27 PM Shyam P wrote: > I am developing a Spark job using Java 1.8. > > Is it possible to write a Spark app using Spring Boot? > Has
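For reference, a minimal sketch (not from the thread) of one way to wire Spark into a Spring Boot app, in the spirit of the Spring for Apache Hadoop guide linked above. The class name, app name, and local master are all hypothetical:

    // Hypothetical sketch: assumes spark-core and spring-boot-starter are on the classpath.
    import org.apache.spark.{SparkConf, SparkContext}
    import org.springframework.boot.{CommandLineRunner, SpringApplication}
    import org.springframework.boot.autoconfigure.SpringBootApplication

    @SpringBootApplication
    class SparkJobApplication extends CommandLineRunner {
      // Spring Boot invokes run() once the application context has started.
      override def run(args: String*): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("spring-boot-spark-job").setMaster("local[*]"))
        try println(sc.parallelize(1 to 100).sum())  // trivial job to prove the wiring
        finally sc.stop()
      }
    }

    object SparkJobApplication {
      def main(args: Array[String]): Unit =
        SpringApplication.run(classOf[SparkJobApplication], args: _*)
    }

One caveat worth noting: Spring Boot's nested fat-jar layout can clash with spark-submit's classpath handling, so many setups run the Spark driver locally or repackage with a plain shaded jar.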

A basic question

2019-06-17 Thread Shyam P
I am developing a Spark job using Java 1.8. Is it possible to write a Spark app using Spring Boot? Has anyone tried it? If so, how should it be done? Regards, Shyam

Re: Basic question. Access MongoDB data in Spark.

2016-06-13 Thread Prajwal Tuladhar
Maybe try opening an issue in their GitHub repo https://github.com/Stratio/Spark-MongoDB On Mon, Jun 13, 2016 at 4:10 PM, Umair Janjua wrote: > Does anybody know Stratio's mailing list? I can't seem to find it. Cheers > > On Mon, Jun 13, 2016 at 6:02 PM, Ted Yu

Re: Basic question. Access MongoDB data in Spark.

2016-06-13 Thread Umair Janjua
Does anybody know Stratio's mailing list? I can't seem to find it. Cheers On Mon, Jun 13, 2016 at 6:02 PM, Ted Yu wrote: > Have you considered posting the question on Stratio's mailing list? > > You may get a faster response there. > > > On Mon, Jun 13, 2016 at 8:09 AM, Umair

Re: Basic question. Access MongoDB data in Spark.

2016-06-13 Thread Ted Yu
Have you considered posting the question on Stratio's mailing list? You may get a faster response there. On Mon, Jun 13, 2016 at 8:09 AM, Umair Janjua wrote: > Hi guys, > > I have this super basic problem which I cannot figure out. Can somebody > give me a hint?

Re: Fw: Basic question on using one's own classes in the Scala app

2016-06-06 Thread Marco Mistroni
getCheckpointDirectory.CheckpointDirectory(ProgramName) > > However, I am getting a compilation error as expected > > not found: type getCheckpointDirectory > [error] val getCheckpointDirectory = new getCheckpointDirectory > [error] ^ > [error] one error found

Fw: Basic question on using one's own classes in the Scala app

2016-06-06 Thread Ashok Kumar
[error] val getCheckpointDirectory = new getCheckpointDirectory
[error]                              ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed
So a basic question: in order for compilation to work, do I need to create a package for my jar file, or add a dependency like the following, as I do in sbt: libraryDependencies += "org.apache.spark"
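"not found: type getCheckpointDirectory" usually means the class is not on the compile classpath or not imported; for one's own code the simplest route is to keep the class in the same sbt project under a package and import it, rather than publishing it as a dependency. A minimal sketch with hypothetical names and layout:

    // src/main/scala/utils/CheckpointDirectory.scala (hypothetical layout)
    package utils

    class CheckpointDirectory {
      // Hypothetical helper mirroring the call in the thread.
      def checkpointDirectory(programName: String): String =
        s"/checkpoints/${programName.trim}"
    }

    // src/main/scala/MyApp.scala, in the same project: only an import is
    // needed; no libraryDependencies entry is required for your own sources.
    // import utils.CheckpointDirectory
    // val hdfsDir = new CheckpointDirectory().checkpointDirectory(programName)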

Re: Basic question on using one's own classes in the Scala app

2016-06-05 Thread Ashok Kumar
Thank you. I added this as a dependency: libraryDependencies += "com.databricks" % "apps.twitter_classifier" % "1.0.0" I chose the number at the end arbitrarily. Is that correct? Also in my TwitterAnalyzer.scala I added this line: import com.databricks.apps.twitter_classifier._ Now I am getting this

Re: Basic question on using one's own classes in the Scala app

2016-06-05 Thread Jacek Laskowski
On Sun, Jun 5, 2016 at 9:01 PM, Ashok Kumar wrote: > Now I have added this > > libraryDependencies += "com.databricks" % "apps.twitter_classifier" > > However, I am getting an error > > > error: No implicit for Append.Value[Seq[sbt.ModuleID], >
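That sbt error arises because "com.databricks" % "apps.twitter_classifier" by itself is only a group/artifact pair, not a complete ModuleID, so += has nothing it can append to Seq[sbt.ModuleID]. A version is required, and note a coordinate like this only resolves if the artifact is actually published to a configured repository:

    // build.sbt: a ModuleID needs all three parts: group % artifact % version.
    // This particular coordinate is illustrative; it must exist in a
    // resolvable repository, otherwise use an unmanaged jar instead.
    libraryDependencies += "com.databricks" % "apps.twitter_classifier" % "1.0.0"

    // Unmanaged alternative: put the jar in lib/ at the project root;
    // sbt adds everything under lib/ to the classpath automatically.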

Re: Basic question on using one's own classes in the Scala app

2016-06-05 Thread Ashok Kumar
val hdfsDir = getCheckpointDirectory.CheckpointDirectory(ProgramName) However, I am getting a compilation error as expected: not found: type getCheckpointDirectory
[error] val getCheckpointDirectory = new getCheckpointDirectory
[error]                              ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed

Re: Basic question on using one's own classes in the Scala app

2016-06-05 Thread Ted Yu
me.trim > val getCheckpointDirectory = new getCheckpointDirectory > val hdfsDir = getCheckpointDirectory.CheckpointDirectory(ProgramName) > > However, I am getting a compilation error as expected > > not found: type getCheckpointDirectory > [error] val getCheckpointDirectory = new getCheckpointDirectory

Re: Basic question on using one's own classes in the Scala app

2016-06-05 Thread Ashok Kumar
[error]                              ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed
So a basic question: in order for compilation to work, do I need to create a package for my jar file, or add a dependency like the following I do in sbt: libraryDependencies += "org.apache.spark"

Re: Basic question on using one's own classes in the Scala app

2016-06-05 Thread Ted Yu
val hdfsDir = getCheckpointDirectory.CheckpointDirectory(ProgramName) > > However, I am getting a compilation error as expected > > not found: type getCheckpointDirectory > [error] val getCheckpointDirectory = new getCheckpointDirectory > [error]

RE: a basic question on first use of PySpark shell and example, which is failing

2016-02-29 Thread Taylor, Ronald C
ronald.taylo...@gmail.com; Taylor, Ronald C Subject: RE: a basic question on first use of PySpark shell and example, which is failing Hi Yin, My CLASSPATH is set to: CLASSPATH=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/jars/*:/people/rtaylor/SparkWork/DataAlgUtils:. And there is indeed a spark-core

RE: a basic question on first use of PySpark shell and example, which is failing

2016-02-29 Thread Taylor, Ronald C
http://www.pnnl.gov/science/staff/staff_info.asp?staff_num=7048 From: Yin Yang [mailto:yy201...@gmail.com] Sent: Monday, February 29, 2016 2:27 PM To: Taylor, Ronald C Cc: Jules Damji; user@spark.apache.org; ronald.taylo...@gmail.com Subject: Re: a basic question on first use of PySpark shell and example, which is failing

Re: a basic question on first use of PySpark shell and example, which is failing

2016-02-29 Thread Yin Yang
> > phone: (509) 372-6568, email: ronald.tay...@pnnl.gov > > web page: http://www.pnnl.gov/science/staff/staff_info.asp?staff_num=7048 > > > > From: Jules Damji [mailto:dmat...@comcast.net] > Sent: Sunday, February 28, 2016 10:07 PM > To: Taylor, Ronald C

Re: a basic question on first use of PySpark shell and example, which is failing

2016-02-28 Thread Jules Damji
Hello Ronald, Since you have placed the file under HDFS, you might want to change the path name to: val lines = sc.textFile("hdfs://user/taylor/Spark/Warehouse.java") Sent from my iPhone Pardon the dumb thumb typos :) > On Feb 28, 2016, at 9:36 PM, Taylor, Ronald C
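A small caveat on the suggested URI, sketched below with placeholder host and port: in "hdfs://user/taylor/...", "user" is parsed as the namenode authority, so a fully qualified path usually needs either a third slash (default filesystem) or an explicit host:

    // Placeholders throughout; adjust to the cluster's namenode.
    // Use the default filesystem from core-site.xml (note the third slash):
    val lines = sc.textFile("hdfs:///user/taylor/Spark/Warehouse.java")
    // Or spell out the namenode explicitly:
    // val lines2 = sc.textFile("hdfs://namenode.example.com:8020/user/taylor/Spark/Warehouse.java")
    println(lines.count())  // forces the read, surfacing path errors early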

a basic question on first use of PySpark shell and example, which is failing

2016-02-28 Thread Taylor, Ronald C
Hello folks, I am a newbie, and am running Spark on a small Cloudera CDH 5.5.1 cluster at our lab. I am trying to use the PySpark shell for the first time, and am attempting to duplicate the documentation example of creating an RDD, which I called "lines", using a text file. I placed a

I am very new to Spark. I have a very basic question. I have an array of values: listofECtokens: Array[String] = Array(EC-17A5206955089011B, EC-17A5206955089011A) I want to filter an RDD for all of

2015-09-09 Thread prachicsa
I am very new to Spark. I have a very basic question. I have an array of values: listofECtokens: Array[String] = Array(EC-17A5206955089011B, EC-17A5206955089011A) I want to filter an RDD for all of these token values. I tried the following: val ECtokens = for (token <- listofECtok

Re: I am very new to Spark. I have a very basic question. I have an array of values: listofECtokens: Array[String] = Array(EC-17A5206955089011B, EC-17A5206955089011A) I want to filter an RDD for all o

2015-09-09 Thread Ted Yu
Thanks > Best Regards > >> On Wed, Sep 9, 2015 at 3:25 PM, prachicsa <prachi...@gmail.com> wrote: >> >> >> I am very new to Spark. >> >> I have a very basic question. I have an array of values: >> >> listofECtokens: Array[String] =

Re: I am very new to Spark. I have a very basic question. I have an array of values: listofECtokens: Array[String] = Array(EC-17A5206955089011B, EC-17A5206955089011A) I want to filter an RDD for all o

2015-09-09 Thread Akhil Das
.contains(item)) found = true } found }).collect() Output: res8: Array[String] = Array(This contains EC-17A5206955089011B) Thanks Best Regards On Wed, Sep 9, 2015 at 3:25 PM, prachicsa <prachi...@gmail.com> wrote: > > > I am very new to Spark. > > I have a very basic
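The approach in the reply above, condensed into a self-contained sketch (the sample lines are invented for illustration):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(
      new SparkConf().setAppName("token-filter").setMaster("local[*]"))
    val listofECtokens = Array("EC-17A5206955089011B", "EC-17A5206955089011A")

    // Invented sample data.
    val rdd = sc.parallelize(Seq(
      "This contains EC-17A5206955089011B",
      "This contains nothing of interest"))

    // Keep a line if it contains any of the tokens.
    val matches = rdd
      .filter(line => listofECtokens.exists(tok => line.contains(tok)))
      .collect()
    matches.foreach(println)  // prints: This contains EC-17A5206955089011B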

Re: initial basic question from new user

2014-06-12 Thread Gerard Maas
The goal of rdd.persist is to create a cached RDD so that its lineage need not be recomputed. Therefore, computations *in the same job* that use that RDD can re-use that intermediate result, but it's not meant to survive between job runs. For example: val baseData =
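The truncated example might have continued along these lines; a hedged sketch, with invented paths and fields, assuming an existing SparkContext sc:

    // Illustrative only. One expensive RDD, persisted, then reused by
    // two computations in the same job so it is evaluated only once.
    val baseData = sc.textFile("hdfs:///data/events.log")
      .map(_.split(","))
      .persist()  // materialized by the first action below

    val countA = baseData.filter(_.headOption.contains("A")).count()  // rows whose first field is "A"
    val countB = baseData.filter(_.headOption.contains("B")).count()  // served from the cache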

Re: initial basic question from new user

2014-06-12 Thread Christopher Nguyen
Toby, #saveAsTextFile() and #saveAsObjectFile() are probably what you want for your use case. As for Parquet support, that newly arrived in Spark 1.0.0 together with SparkSQL, so continue to watch this space. Gerard's suggestion to look at JobServer, which you can generalize as building a
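A hedged sketch of that save/reload round trip (paths and the pair type are placeholders, with an existing SparkContext sc assumed):

    // saveAsObjectFile writes Java-serialized SequenceFiles; objectFile
    // reads them back with the element type supplied explicitly.
    val agg = sc.parallelize(Seq(("k1", 10L), ("k2", 20L)))
    agg.saveAsObjectFile("hdfs:///tmp/agg-object")
    agg.map { case (k, v) => s"$k\t$v" }.saveAsTextFile("hdfs:///tmp/agg-text")

    // In a later job:
    val reloaded = sc.objectFile[(String, Long)]("hdfs:///tmp/agg-object")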

Re: initial basic question from new user

2014-06-12 Thread FRANK AUSTIN NOTHAFT
RE: Given that our agg sizes will exceed memory, we expect to cache them to disk, so save-as-object (assuming there are no out-of-the-ordinary performance issues) may solve the problem, but I was hoping to store data in a column-oriented format. However, I think this in general is not possible
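For the column-oriented route, the Spark 1.0-era SparkSQL API looked roughly like this; a sketch with an invented schema, where Parquet's column pruning means later queries read only the columns they touch:

    import org.apache.spark.sql.SQLContext

    case class Event(user: String, bytes: Long)  // invented schema

    val sqlContext = new SQLContext(sc)
    import sqlContext.createSchemaRDD  // implicit RDD -> SchemaRDD conversion

    sc.parallelize(Seq(Event("a", 10L), Event("b", 20L)))
      .saveAsParquetFile("hdfs:///tmp/events.parquet")

    // A later job reads back only the columns the query touches:
    val events = sqlContext.parquetFile("hdfs:///tmp/events.parquet")
    events.registerAsTable("events")  // 1.0 name; later versions use registerTempTable
    sqlContext.sql("SELECT user FROM events").collect().foreach(println)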

Re: initial basic question from new user

2014-06-12 Thread Andre Schumacher
Hi, On 06/12/2014 05:47 PM, Toby Douglass wrote: In these future jobs, when I come to load the aggregated RDD, will Spark load, and only load, the columns being accessed by the query? Or will Spark load everything, to convert it into an internal representation, and then execute the query?

Re: initial basic question from new user

2014-06-12 Thread Toby Douglass
On Thu, Jun 12, 2014 at 4:48 PM, Andre Schumacher schum...@icsi.berkeley.edu wrote: On 06/12/2014 05:47 PM, Toby Douglass wrote: In these future jobs, when I come to load the aggregated RDD, will Spark load, and only load, the columns being accessed by the query? Or will Spark load