Yes, you can share RDDs through Tachyon while keeping the data in memory.
Spark jobs can write to a Tachyon path (tachyon://host:port/path/), and
other jobs can then read from the same path.
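For example, a minimal sketch (untested; the host, port, and path below are
placeholders, not values from this thread):

    // In the job that produces the data: write the RDD out to Tachyon.
    rdd.saveAsTextFile("tachyon://tachyon-master:19998/shared/my-rdd")

    // In another job, possibly a different application started later:
    // read the same path back, served out of Tachyon's memory.
    val shared = sc.textFile("tachyon://tachyon-master:19998/shared/my-rdd")
    shared.count()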
Here is a presentation that includes that use case:
http://www.slideshare.net/TachyonNexus/tachyon-presentation-at-
Hi Praveen, have you checked this out? It might have the details you need:
https://spark-summit.org/2014/wp-content/uploads/2014/07/Spark-Job-Server-Easy-Spark-Job-Management-Chan-Chu.pdf
Best Regards,
Jia
On Jan 19, 2016, at 7:28 AM, praveen S wrote:
Can you give me more details on Spark's JobServer?
Regards,
Praveen
On 18 Jan 2016 03:30, "Jia" wrote:
I guess all jobs submitted through JobServer are executed in the same JVM, so
RDDs cached by one job can be visible to all other jobs executed later.
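For example, with spark-jobserver's NamedRddSupport, one job can cache an
RDD under a name and a job submitted later can look it up (a rough sketch
against the 0.6.x-era API, untested; the object and path names are made up):

    import com.typesafe.config.Config
    import org.apache.spark.SparkContext
    import spark.jobserver.{NamedRddSupport, SparkJob, SparkJobValid, SparkJobValidation}

    // Job 1: build an RDD and cache it under a name in the shared context.
    object BuildWords extends SparkJob with NamedRddSupport {
      override def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid
      override def runJob(sc: SparkContext, config: Config): Any = {
        val words = sc.textFile("hdfs://namenode:9000/input/words")
        this.namedRdds.update("words", words)  // cached, visible to later jobs
        words.count()
      }
    }

    // Job 2, submitted later to the same context: reuse the cached RDD.
    object CountNonEmpty extends SparkJob with NamedRddSupport {
      override def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid
      override def runJob(sc: SparkContext, config: Config): Any =
        this.namedRdds.get[String]("words").map(_.filter(_.nonEmpty).count()).getOrElse(0L)
    }

Both jobs are then submitted at different times against the same
long-running context (create the context once, then pass context=<name>
when posting each job), so they run in the same JVMs and see the same cache.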
On Jan 17, 2016, at 3:56 PM, Mark Hamstra wrote:
Yes, that is one of the basic reasons to use a
jobserver/shared-SparkContext. Otherwise, in order to share the data in an
RDD you have to use an external storage system, such as a distributed
filesystem or Tachyon.
On Sun, Jan 17, 2016 at 1:52 PM, Jia wrote:
Thanks, Mark. Then, I guess JobServer can fundamentally solve my problem, so
that jobs can be submitted at different time and still share RDDs.
Best Regards,
Jia
On Jan 17, 2016, at 3:44 PM, Mark Hamstra wrote:
There is a 1-to-1 relationship between Spark Applications and SparkContexts
-- fundamentally, a Spark Application is a program that creates and uses a
SparkContext, and that SparkContext is destroyed when the Application ends.
A jobserver generically, and the Spark JobServer specifically, is an
Application that holds a long-running SparkContext through which many Jobs
can be run.
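Concretely, a toy sketch of that relationship:

    import org.apache.spark.{SparkConf, SparkContext}

    object SingleContextApp {
      def main(args: Array[String]): Unit = {
        // One Application <-> one SparkContext.
        val sc = new SparkContext(new SparkConf().setAppName("single-context-demo"))
        val data = sc.parallelize(1 to 1000000).cache()

        // Each action below is a separate Job, but all Jobs share this
        // context, so the cached RDD is computed once and reused.
        val total = data.sum()
        val evens = data.filter(_ % 2 == 0).count()
        println(s"sum=$total, evens=$evens")

        // When the Application ends, the context and its cached RDDs are gone.
        sc.stop()
      }
    }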
Hi, Mark, sorry for the confusion.
Let me clarify: when an application is submitted, the master will tell each
Spark worker to spawn an executor JVM process. All the task sets of the
application will be executed by that executor. After the application runs to
completion, the executor process will be killed.
You've still got me confused. The SparkContext exists at the Driver, not
on an Executor.
Many Jobs can be run by a SparkContext -- it is a common pattern to use
something like the Spark Jobserver where all Jobs are run through a shared
SparkContext.
On Sun, Jan 17, 2016 at 12:57 PM, Jia Zou wrote:
Hi, Mark, sorry, I mean SparkContext.
I want to change Spark so that all submitted jobs (SparkContexts) run in
one executor JVM.
Best Regards,
Jia
On Sun, Jan 17, 2016 at 2:21 PM, Mark Hamstra wrote:
-dev
What do you mean by JobContext? That is a Hadoop MapReduce concept, not
Spark.
On Sun, Jan 17, 2016 at 7:29 AM, Jia Zou wrote:
Dear all,
Is there a way to reuse executor JVM across different JobContexts? Thanks.
Best Regards,
Jia