Hi Wandong,

Livy's shared object mechanism is mainly used to share objects between
different Livy jobs; it is primarily intended for the Job API. For example,
if Job A creates an object Foo that should be accessible to Job B, the user
can store Foo in the JobContext under a chosen name, and Job B can then
retrieve it by that name.
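As a rough sketch (assuming the `org.apache.livy` Job API with the `setSharedObject`/`getSharedObject` methods on `JobContext`; the job class names and the "sharedFoo" key are just illustrative), it looks something like this:

```java
import org.apache.livy.Job;
import org.apache.livy.JobContext;

// Job A: creates an object and publishes it in the JobContext under a name.
class ProducerJob implements Job<Void> {
    @Override
    public Void call(JobContext ctx) {
        String foo = "some state built by Job A";
        ctx.setSharedObject("sharedFoo", foo);
        return null;
    }
}

// Job B: submitted later to the SAME interactive session, looks the
// object up by the agreed-upon name.
class ConsumerJob implements Job<String> {
    @Override
    public String call(JobContext ctx) {
        String foo = ctx.getSharedObject("sharedFoo");
        return foo;
    }
}
```

Both jobs would be submitted through the same `LivyClient` session; the shared object lives in that session's JobContext, not in Spark's storage layer.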

This is different from Spark's cache mechanism. What you mentioned above
(the temp table) is Spark's own table cache mechanism, which is unrelated
to Livy.
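For contrast, here is a minimal sketch of the Spark side (using Spark's Java API; the parquet path is a placeholder). Note that registering a temp view by itself does not cache anything; caching only happens when you ask for it explicitly:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class CacheTableExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("cache-demo")
                .master("local[*]")
                .getOrCreate();

        // Registering a temp view only records the name -> plan mapping;
        // each query against it re-reads the parquet source.
        Dataset<Row> df = spark.read().parquet("/path/to/parquet"); // placeholder path
        df.createOrReplaceTempView("TmpTable2");

        // Explicit caching: the table is materialized in executor memory
        // on the Spark cluster the first time it is scanned.
        spark.catalog().cacheTable("TmpTable2");
        spark.sql("SELECT count(*) FROM TmpTable2").show();

        spark.catalog().uncacheTable("TmpTable2");
        spark.stop();
    }
}
```

Either way, the cached data lives on the Spark executors in the cluster; Livy itself never holds the table.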



Wandong Wu <wu...@husky.neu.edu> wrote on Wed, Jul 11, 2018, 5:46 PM:

> Dear Sir or Madam:
>
>       I am a Livy beginner. I use Livy because, within an interactive
> session, different Spark jobs can share cached RDDs or DataFrames.
>
>       When I read some parquet files and create a table called "TmpTable",
> the following queries will use this table. Does that mean the table has
> been cached?
>       If so, where is the table cached: in Livy or in the Spark cluster?
>
>       Spark also supports a cache function. When I read some parquet files
> and create a table called "TmpTable2", I add this code: sql_ctx.cacheTable(
> *'tmpTable2'*).
>       In the next query using this table, it will be cached in the Spark
> cluster, and the following queries can then use the cached table.
>
>       What is the difference between cached in Livy and cached in Spark
> cluster?
>
> Thanks!
>
> Yours
> Wandong
>
>
