Thanks for the link. My question was how to access the LLAP daemon from
containers to retrieve data for Hive jobs.

For example, a Hive job may start Tez containers, which then retrieve data
from LLAP daemons running concurrently. In the current implementation this is
unrealistic (because every task can simply be sent to the LLAP daemon), but I
wonder if it is feasible in principle (with a bit of hacking).
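
To make this concrete, below is a rough sketch of the closest mechanism I
know of today: the LLAP external-client API (LlapBaseInputFormat from the
hive-llap-ext-client module), where an external process asks HiveServer2 for
LLAP splits and then reads rows back from the daemons. The connection URL,
user/password, and table name are placeholders, and the exact constant and
Row accessor names may differ slightly across Hive versions.

import java.io.IOException;

import org.apache.hadoop.hive.llap.LlapBaseInputFormat;
import org.apache.hadoop.hive.llap.Row;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

public class LlapExternalReadSketch {
  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf();
    // HiveServer2 connection used only to plan the query and hand back LLAP
    // splits (URL, user, password, and query below are placeholders).
    conf.set(LlapBaseInputFormat.URL_KEY, "jdbc:hive2://hs2-host:10000/default");
    conf.set(LlapBaseInputFormat.USER_KEY, "hive");
    conf.set(LlapBaseInputFormat.PWD_KEY, "");
    conf.set(LlapBaseInputFormat.QUERY_KEY, "select * from terasort_input");

    LlapBaseInputFormat inputFormat = new LlapBaseInputFormat();

    // Each split is bound to an LLAP daemon; in the scenario above, an
    // external Tez/MR task would be handed one split and read rows from
    // that daemon.
    InputSplit[] splits = inputFormat.getSplits(conf, 1);
    for (InputSplit split : splits) {
      RecordReader<NullWritable, Row> reader =
          inputFormat.getRecordReader(split, conf, Reporter.NULL);
      Row row = reader.createValue();
      while (reader.next(NullWritable.get(), row)) {
        System.out.println(row.getValue(0));  // first projected column
      }
      reader.close();
    }
  }
}

As far as I understand, even this path makes the daemon run a query fragment
on its executors to produce rows, rather than just serving IO requests, which
is the part I would like to avoid.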

On Tue, Jan 30, 2018 at 3:42 PM, Jörn Franke <jornfra...@gmail.com> wrote:

> Are you looking for something like this:
> https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html
>
> To answer your original question: why not implement the whole job in Hive?
> Or orchestrate using Oozie, with some parts in MR and some in Hive.
>
> On 30. Jan 2018, at 05:15, Sungwoo Park <glap...@gmail.com> wrote:
>
> Hello all,
>
> I wonder if an external YARN container can send requests to the LLAP daemon
> to read data from its in-memory cache. For example, YARN containers owned by
> a typical MapReduce job (e.g., TeraSort) could fetch data directly from LLAP
> instead of contacting HDFS. In this scenario, the LLAP daemon just serves IO
> requests from YARN containers and does not run its executors to perform
> non-trivial computation.
>
> If this is feasible, the LLAP daemon could be shared by all services running
> in the cluster. Any comment would be appreciated. Thanks a lot.
>
> -- Gla Park
>
>
