Thank you so much, Deepak.
Let me implement it and update you. Hope it works.
Are there any shortcomings I need to consider or take care of?
Regards,
Shyam
On Mon, Jun 17, 2019 at 12:39 PM Deepak Sharma wrote:
You can follow this example:
https://docs.spring.io/spring-hadoop/docs/current/reference/html/springandhadoop-spark.html
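For reference, a minimal sketch in Scala of the same idea (class names and the master URL are assumptions, and this is not the spring-hadoop sample itself): a Spring Boot CommandLineRunner that builds a SparkSession once the application context is up.

import org.apache.spark.sql.SparkSession
import org.springframework.boot.{CommandLineRunner, SpringApplication}
import org.springframework.boot.autoconfigure.SpringBootApplication

@SpringBootApplication
class SparkJobApplication extends CommandLineRunner {
  // Spring Boot calls run() once the application context has started.
  override def run(args: String*): Unit = {
    val spark = SparkSession.builder()
      .appName("spring-boot-spark-job")
      .master("local[*]")      // replace with your cluster master, or set it via spark-submit
      .getOrCreate()
    try {
      spark.range(1, 100).selectExpr("sum(id)").show()   // placeholder for the real job
    } finally {
      spark.stop()
    }
  }
}

object SparkJobApplication {
  def main(args: Array[String]): Unit = {
    SpringApplication.run(classOf[SparkJobApplication], args: _*)
    ()
  }
}

The shortcoming people most often report is packaging: Spring Boot's repackaged nested-jar layout does not always play nicely with spark-submit and Spark's own classpath, so a plain shaded jar (or the spring-hadoop tooling from the link above) tends to be less painful.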
On Mon, Jun 17, 2019 at 12:27 PM Shyam P wrote:
I am developing a Spark job using Java 1.8.
Is it possible to write a Spark app using Spring Boot technology?
Has anyone tried it? If so, how should it be done?
Regards,
Shyam
Maybe try opening an issue in their GH repo:
https://github.com/Stratio/Spark-MongoDB
On Mon, Jun 13, 2016 at 4:10 PM, Umair Janjua wrote:
Does anybody know Stratio's mailing list? I can't seem to find it. Cheers
On Mon, Jun 13, 2016 at 6:02 PM, Ted Yu wrote:
Have you considered posting the question on Stratio's mailing list?
You may get a faster response there.
On Mon, Jun 13, 2016 at 8:09 AM, Umair Janjua wrote:
> Hi guys,
>
> I have this super basic problem which I cannot figure out. Can somebody
> give me a hint.
>
> val getCheckpointDirectory = new getCheckpointDirectory
> val hdfsDir = getCheckpointDirectory.CheckpointDirectory(ProgramName)
>
> However, I am getting a compilation error as expected
>
> not found: type getCheckpointDirectory
> [error] val getCheckpointDirectory = new getCheckpointDirectory
> [error]                                  ^
> [error] one error found
> [error] (compile:compileIncremental) Compilation failed
So a basic question: in order for compilation to work, do I need to create a package for my jar file, or add a dependency in sbt like the following?
libraryDependencies += "org.apache.spark"
Thank you.
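For reference, a minimal sketch with hypothetical names of what the compiler needs: the type used with new must be defined in your own sources (or in a library on the sbt classpath) and be visible where it is used.

// src/main/scala/utils/CheckpointDirectory.scala  (hypothetical names, for illustration only)
package utils

class CheckpointDirectory {
  // Build an HDFS checkpoint path for a given program name (illustrative).
  def checkpointDirectory(programName: String): String =
    s"hdfs:///checkpoints/${programName.trim}"
}

// src/main/scala/MyApp.scala
import utils.CheckpointDirectory

object MyApp {
  def main(args: Array[String]): Unit = {
    val dirBuilder = new CheckpointDirectory
    val hdfsDir    = dirBuilder.checkpointDirectory("MyStreamingApp")
    println(hdfsDir)
  }
}

Classes that live under src/main/scala are compiled together with the rest of the project; libraryDependencies is only for artifacts that have been published to a repository sbt can resolve.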
I added this as a dependency:
libraryDependencies += "com.databricks" % "apps.twitter_classifier" % "1.0.0"
That number at the end I chose arbitrarily. Is that correct?
Also, in my TwitterAnalyzer.scala I added this line:
import com.databricks.apps.twitter_classifier._
Now I am getting this
On Sun, Jun 5, 2016 at 9:01 PM, Ashok Kumar wrote:
> Now I have added this
>
> libraryDependencies += "com.databricks" % "apps.twitter_classifier"
>
> However, I am getting an error
>
>
> error: No implicit for Append.Value[Seq[sbt.ModuleID],
>
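For comparison, a sketch of a well-formed dependency block (versions are examples only). The Append.Value[Seq[sbt.ModuleID], ...] error quoted above is what sbt reports when the third part, the version, is missing, and coordinates that were never published to a resolvable repository will not resolve no matter which version is chosen.

// build.sbt, a sketch; versions shown are examples only
scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"      % "1.6.1" % "provided",
  "org.apache.spark" %% "spark-streaming" % "1.6.1" % "provided"
)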
Subject: RE: a basic question on first use of PySpark shell and example, which is failing
Hi Yin,
My Classpath is set to:
CLASSPATH=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/jars/*:/people/rtaylor/SparkWork/DataAlgUtils:.
And there is indeed a spark-core
From: Yin Yang [mailto:yy201...@gmail.com]
Sent: Monday, February 29, 2016 2:27 PM
To: Taylor, Ronald C
Cc: Jules Damji; user@spark.apache.org; ronald.taylo...@gmail.com
Subject: Re: a basic question on first use of PySpark shell and example, which is failing
From: Jules Damji [mailto:dmat...@comcast.net]
Sent: Sunday, February 28, 2016 10:07 PM
To: Taylor, Ronald C
Hello Ronald,
Since you have placed the file under HDFS, you might want to change the path name to:
val lines = sc.textFile("hdfs://user/taylor/Spark/Warehouse.java")
Sent from my iPhone
Pardon the dumb thumb typos :)
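One thing worth checking (a sketch; the exact URI prefix depends on the cluster's namenode configuration): sc.textFile is lazy, so a wrong path only surfaces when an action runs.

val lines = sc.textFile("hdfs:///user/taylor/Spark/Warehouse.java")
lines.count()   // an action forces the read; a missing path fails here with InvalidInputException
lines.first()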
> On Feb 28, 2016, at 9:36 PM, Taylor, Ronald C wrote:
Hello folks,
I am a newbie, and am running Spark on a small Cloudera CDH 5.5.1 cluster at our lab. I am trying to use the PySpark shell for the first time, and am attempting to duplicate the documentation example of creating an RDD, which I called "lines", using a text file.
I placed a
I am very new to Spark.
I have a very basic question. I have an array of values:
listofECtokens: Array[String] = Array(EC-17A5206955089011B,
EC-17A5206955089011A)
I want to filter an RDD for all of these token values. I tried the following
way:
val ECtokens = for (token <- listofECtok
.contains(item)) found = true
}
found
}).collect()
Output:
res8: Array[String] = Array(This contains EC-17A5206955089011B)
Thanks
Best Regards
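A self-contained sketch of that kind of filter (the sample lines are made up), using exists rather than a mutable flag:

val listofECtokens: Array[String] = Array("EC-17A5206955089011B", "EC-17A5206955089011A")
val rdd = sc.parallelize(Array(
  "This contains EC-17A5206955089011B",
  "This contains nothing of interest"))

// Keep a line if it contains at least one of the tokens.
val matches = rdd.filter(line => listofECtokens.exists(token => line.contains(token)))
matches.collect()   // Array(This contains EC-17A5206955089011B)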
On Wed, Sep 9, 2015 at 3:25 PM, prachicsa <prachi...@gmail.com> wrote:
The goal of rdd.persist is to create a cached RDD that breaks the DAG lineage. Therefore, computations *in the same job* that use that RDD can re-use that intermediate result, but it is not meant to survive between job runs.
For example:
val baseData =
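A minimal sketch of that within-job reuse (the path and the filters are assumptions):

import org.apache.spark.storage.StorageLevel

val baseData = sc.textFile("hdfs:///data/events.log").persist(StorageLevel.MEMORY_AND_DISK)

val errorCount = baseData.filter(_.contains("ERROR")).count()   // materializes and caches baseData
val warnCount  = baseData.filter(_.contains("WARN")).count()    // re-uses the cached partitions
// The cache lives only for this application; nothing here survives into a later job run.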
Toby, #saveAsTextFile() and #saveAsObjectFile() are probably what you want
for your use case. As for Parquet support, that's newly arrived in Spark
1.0.0 together with SparkSQL so continue to watch this space.
Gerard's suggestion to look at JobServer, which you can generalize as
building a
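A minimal sketch of that approach (paths and the aggregation are assumptions); unlike persist, the files written here outlive the job and can be read back by a later one:

val agg = sc.textFile("hdfs:///data/events.log")
  .map(line => (line.take(4), 1L))
  .reduceByKey(_ + _)

agg.saveAsTextFile("hdfs:///agg/daily-counts-text")   // human-readable
agg.saveAsObjectFile("hdfs:///agg/daily-counts")      // Java-serialized binary

// A later, separate job can read the aggregate back:
val reloaded = sc.objectFile[(String, Long)]("hdfs:///agg/daily-counts")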
RE:
Given that our agg sizes will exceed memory, we expect to cache them to disk, so save-as-object (assuming there are no out-of-the-ordinary performance issues) may solve the problem, but I was hoping to store data in a column-oriented format. However, I think this in general is not possible
Hi,
On 06/12/2014 05:47 PM, Toby Douglass wrote:
In these future jobs, when I come to load the aggregated RDD, will Spark load only the columns being accessed by the query, or will Spark load everything, convert it into an internal representation, and then execute the query?
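A minimal sketch against the Spark 1.0-era SparkSQL API (names and paths are assumptions): with the aggregate stored as Parquet, a query that selects a single column should only need to read that column's data.

case class Agg(key: String, total: Long)

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.createSchemaRDD   // implicit RDD[Agg] -> SchemaRDD

val agg = sc.textFile("hdfs:///data/events.log")
  .map(line => (line.take(4), 1L))
  .reduceByKey(_ + _)
  .map { case (k, v) => Agg(k, v) }

agg.saveAsParquetFile("hdfs:///agg/daily-counts.parquet")

// A later job loads the Parquet file and touches only the requested column:
val loaded = sqlContext.parquetFile("hdfs:///agg/daily-counts.parquet")
loaded.registerAsTable("agg")
sqlContext.sql("SELECT total FROM agg").collect()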