RE: Hive +Tez+LLAP does not have obvious performance improvement than HIVE + Tez

Jia, Ke A Wed, 22 Nov 2017 21:36:07 -0800

Hi Gopal,
Thanks for your reply.
We will upgrade Hadoop to 2.8 later.
In our test, we found that the shuffle stage of LLAP is very slow. Do we need
to configure any shuffle-related settings? We get the following log from the
LLAP daemon during the shuffle stage:
2017-11-23T12:48:39,367 INFO  [New I/O worker #120 ()] org.apache.hadoop.hive.llap.shufflehandler.ShuffleHandler: Setting connection close header...
2017-11-23T12:48:39,656 INFO  [New I/O worker #121 ()] org.apache.hadoop.hive.llap.shufflehandler.ShuffleHandler: Setting connection close header...
2017-11-23T12:48:39,698 INFO  [New I/O worker #122 ()] org.apache.hadoop.hive.llap.shufflehandler.ShuffleHandler: Setting connection close header...
2017-11-23T12:48:39,849 INFO  [New I/O worker #123 ()] org.apache.hadoop.hive.llap.shufflehandler.ShuffleHandler: Setting connection close header...
2017-11-23T12:48:39,859 INFO  [New I/O worker #124 ()] org.apache.hadoop.hive.llap.shufflehandler.ShuffleHandler: Setting connection close header...
2017-11-23T12:48:40,037 INFO  [New I/O worker #125 ()] org.apache.hadoop.hive.llap.shufflehandler.ShuffleHandler: Setting connection close header...
2017-11-23T12:48:40,141 INFO  [New I/O worker #126 ()] org.apache.hadoop.hive.llap.shufflehandler.ShuffleHandler: Setting connection close header...
2017-11-23T12:48:40,298 INFO  [New I/O worker #127 ()] org.apache.hadoop.hive.llap.shufflehandler.ShuffleHandler: Setting connection close header...
2017-11-23T12:48:40,718 INFO  [New I/O worker #128 ()] org.apache.hadoop.hive.llap.shufflehandler.ShuffleHandler: Setting connection close header...


"hive.llap.execution.mode" now has five modes: "auto", "none", "all", "map"
and "only". Do you have any suggestions about which one to use? Does the "all"
mode give the best performance? And how do the "auto" and "only" modes work?

Regards,
Jia Ke
-----Original Message-----
From: Gopal Vijayaraghavan [mailto:gop...@apache.org] 
Sent: Thursday, November 23, 2017 4:03 AM
To: user@hive.apache.org
Subject: Re: Hive +Tez+LLAP does not have obvious performance improvement than 
HIVE + Tez

Hi,

> With these configurations,  the cpu utilization of llap is very low.

Low CPU usage has been observed with LLAP due to RPC starvation.

I'm going to assume that the build you're testing is a raw Hadoop 2.7.3 with no 
additional patches?

Hadoop-RPC is single-threaded & has a single mutex lock in the 2.7.x branch, 
which is fixed in 2.8.

Can you confirm if you have backported either 

https://issues.apache.org/jira/browse/HADOOP-11772
or 
https://issues.apache.org/jira/browse/HADOOP-12475

to your Hadoop implementation?
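
A quick first check is the version string itself. The sketch below assumes
the first line of `hadoop version` has the form "Hadoop 2.7.3";
rpc_fix_status is a hypothetical helper name, not part of Hadoop. A build
older than 2.8 may still carry the backports, which the version string alone
cannot reveal, so the result for such builds is only a prompt to check.

```shell
#!/bin/sh
# Reads `hadoop version` output on stdin and inspects the first line
# (assumed form: "Hadoop <major>.<minor>.<patch>").
# Prints "ok" for 2.8 or later, "check-backports" for anything older.
# rpc_fix_status is a hypothetical helper name used for illustration.
rpc_fix_status() {
    awk 'NR == 1 {
        split($2, v, ".")
        major = v[1] + 0   # force numeric comparison
        minor = v[2] + 0
        if (major > 2 || (major == 2 && minor >= 8)) print "ok"
        else print "check-backports"
    }'
}

# Usage on a cluster node:
#   hadoop version | rpc_fix_status
```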

The secondary IO starvation comes from a series of HDFS performance problems 
which are easily worked around. Here's the root cause:

https://issues.apache.org/jira/browse/HDFS-9146

The current workaround is to make sure that the HDFS & Hive users each have a 
limits.d entry that allows them to open a large number of sockets (which are fds).

https://github.com/apache/ambari/blob/trunk/ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/templates/hdfs.conf.j2
+
https://github.com/apache/ambari/blob/trunk/ambari-server/src/main/resources/common-services/HIVE/2.1.0.3.0/package/templates/hive.conf.j2

This increases the FD limit for Hive & HDFS users (YARN also needs it, in case 
of Tez due to shuffle being served out of the NodeManager).
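
To confirm that a raised limit actually reached a running daemon, the limits
of the live process can be read from /proc. This is a minimal sketch, assuming
a Linux host and the usual /proc/<pid>/limits layout; nofile_limits is a
hypothetical helper name, and the pid has to be supplied by you.

```shell
#!/bin/sh
# Reads text in /proc/<pid>/limits format on stdin and prints the soft and
# hard "Max open files" limits, separated by a space.
# nofile_limits is a hypothetical helper name used for illustration.
nofile_limits() {
    awk '/^Max open files/ { print $4, $5 }'
}

# Usage, with the pid of the LLAP daemon, DataNode or NodeManager:
#   nofile_limits < /proc/<pid>/limits
```

Reading the limit from the running process rather than from `ulimit -n` in a
login shell matters because a daemon started before the limits.d change keeps
its old limit until it is restarted.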

After increasing the FDs, LLAP is fast enough to run through 128 socket 
openings within the Linux TCP MSL (60 seconds).

The RHEL default for somaxconn  is 128, which causes 120s timeouts when HDFS 
silently loses packets & forces the packet timeout to expire before retrying.

To know whether the problem has already happened, check the kernel's SNMP counters:

# netstat -s | grep "overflow"

<n> times the listen queue of a socket overflowed

Or, to find out whether the kernel has worked around the SYN flood issue with 
cookies:

# dmesg | grep cookies
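
The checks above can be wrapped into small helpers for use across many nodes.
This is a sketch under stated assumptions: the helper names are hypothetical,
the counter line is matched by the wording shown above, and the 1024 target in
the usage comment is an illustrative value, not a recommendation from this
thread.

```shell
#!/bin/sh
# Reads `netstat -s` text on stdin and prints the listen-queue overflow
# count, or 0 if the counter line is absent.
# Both function names are hypothetical helpers used for illustration.
listen_overflows() {
    awk '/times the listen queue of a socket overflowed/ { n = $1 }
         END { print n + 0 }'
}

# Prints "raise" if the current somaxconn ($1) is below the wanted
# backlog ($2), otherwise "ok".
somaxconn_status() {
    if [ "$1" -lt "$2" ]; then
        echo raise
    else
        echo ok
    fi
}

# Usage on a Linux host:
#   netstat -s | listen_overflows
#   somaxconn_status "$(cat /proc/sys/net/core/somaxconn)" 1024
```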

After this, you get hit by the DNS starvation within LLAP where the DNS server 
traffic (port 53 UDP) gets lost (or the DNS server bans an IP due to massive 
number of packets).

This comes from a JDK internal detail: the JDK ignores the actual DNS TTL 
values. It can be worked around by running nscd or sssd on the host to cache 
DNS lookups without constantly generating UDP network packets.
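
A rough way to see whether either caching daemon is present on a host is to
look for it in the process list. This is only a sketch: dns_cache_daemons is a
hypothetical helper name, and a running daemon does not show whether host
caching is actually enabled in its configuration.

```shell
#!/bin/sh
# Reads process names on stdin, one per line, and prints the name of any
# DNS-caching daemon found (nscd or sssd), each at most once.
# dns_cache_daemons is a hypothetical helper name used for illustration.
dns_cache_daemons() {
    grep -x -E 'nscd|sssd' | sort -u
}

# Usage:
#   ps -e -o comm= | dns_cache_daemons
```

Empty output means neither daemon was found under those exact process names.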

If you need more detail on any of these, ask away. I've had to report and get 
backports for several of these issues into HDP (mostly because perf issues are 
not generally community backports & whatever has good workarounds remains off 
the priority lists).

Cheers,
Gopal
