Hi Gopal,
Thanks for your reply.
> A first step would be to check if LLAP cache is actually being used (the LLAP 
> IO in the explain), vectorization is being used (llap, vectorized for tasks), 
> that the column stats show as COMPLETE (instead of NONE).
1. For the LLAP cache, we have enabled it by setting "hive.llap.io.enabled" to 
true, and the explain shows "LLAP IO: all inputs". We set the IO cache size to 
only 20g; is that suitable?
2. For vectorization, we set "hive.vectorized.execution.enabled=true", and 
the explain shows "Execution mode: vectorized, llap".
3. For the column stats, our explain shows "Basic stats: COMPLETE Column stats: 
COMPLETE".

> Are there recommended values for the LLAP container size, JVM heap size, 
> cache size, executors and IO threads? In the LLAP daemon we currently set the 
> container size to 210g, the JVM heap size to 180g, the cache size to 20g, and 
> both the executors and IO threads to 81 (YARN has 81 CPU vcores). Are these 
> settings suitable? With this configuration, the CPU utilization of LLAP is 
> very low.
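
For reference, the arithmetic behind these settings can be sketched as below. 
The helper name llap_sizing is hypothetical, and the sketch assumes the cache is 
allocated off-heap, so that the container has to hold the heap plus the cache. 
It only restates the numbers above and does not say what the right values are.

```python
def llap_sizing(container_gb, heap_gb, cache_gb, executors):
    """Derive per-executor heap and leftover container memory.

    Hypothetical helper. Assumes the LLAP cache is off-heap, so that
    container = heap + cache + headroom.
    """
    return {
        # Average JVM heap available to each executor.
        "heap_per_executor_gb": heap_gb / executors,
        # Memory left in the container after heap and cache.
        "headroom_gb": container_gb - heap_gb - cache_gb,
    }


# Values taken from this email: 210g container, 180g heap, 20g cache,
# 81 executors.
sizing = llap_sizing(container_gb=210, heap_gb=180, cache_gb=20, executors=81)
print(round(sizing["heap_per_executor_gb"], 2), sizing["headroom_gb"])
```

With these inputs, each executor gets a bit over 2 GB of heap on average, and 
10 GB of the container is left over after the heap and the cache.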

Regards,
Jia Ke


-----Original Message-----
From: Gopal Vijayaraghavan [mailto:gop...@apache.org] 
Sent: Wednesday, November 22, 2017 11:20 AM
To: user@hive.apache.org
Subject: Re: Hive +Tez+LLAP does not have obvious performance improvement than 
HIVE + Tez

Hi,

>  Please help us find whether we use the wrong configuration. Thanks for your 
> help.

Since there are no details, I'm not sure what configuration you are discussing 
here.

A first step would be to check if LLAP cache is actually being used (the LLAP 
IO in the explain), vectorization is being used (llap, vectorized for tasks), 
that the column stats show as COMPLETE (instead of NONE).

Here are some basic config defaults that LLAP ships with in an HDP install:

https://github.com/apache/ambari/blob/trunk/ambari-server/src/main/resources/common-services/HIVE/2.1.0.3.0/configuration/hive-interactive-site.xml

You're probably in for a fairly long configuration journey - in the HDP 
install, we've got almost 2x perf gains in some queries by using Log4J2 
async logging (but only for LLAP; logging within HiveServer2 is sync).

These configs are all driven by the installer, because Hive only contains 
log4j2 .template files in the release tarballs.

Cheers,
Gopal




