[ https://issues.apache.org/jira/browse/HDFS-14594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16913840#comment-16913840 ]
Wei-Chiu Chuang commented on HDFS-14594:
----------------------------------------
I am seeing the same with JDK 8u141, CDH6.3.0. A swebhdfs request costs 6 connections.
Additionally, it costs 3 connections to the KMS for an encryption zone (Kerberos + SSL):
{noformat}
$ strace -T -tt -f hdfs dfs -cat /ez1/f1 2>&1 | grep "sin_port=htons(16000)"
[pid 1137504] 01:27:37.754675 connect(227, {sa_family=AF_INET,
sin_port=htons(16000), sin_addr=inet_addr("172.27.72.129")}, 16) = -1
EINPROGRESS (Operation now in progress) <0.000082>
[pid 1137504] 01:27:37.947628 connect(227, {sa_family=AF_INET,
sin_port=htons(16000), sin_addr=inet_addr("172.27.72.129")}, 16 <unfinished ...>
[pid 1137504] 01:27:37.999086 connect(229, {sa_family=AF_INET,
sin_port=htons(16000), sin_addr=inet_addr("172.27.72.129")}, 16 <unfinished ...>
{noformat}
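
Note that the grep filter matches any syscall line that mentions the port, not only connect(). As a minimal sketch (the class name StraceConnectCounter is hypothetical, and it assumes raw strace output for IPv4 sockets), the connect() calls to one port could be counted like this:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: counts connect() syscalls per port
// in strace output.
public class StraceConnectCounter {
    // Matches the start of a connect() call on an AF_INET socket and captures the port.
    // \s+ tolerates line wrapping between the arguments. "<... connect resumed>" lines
    // and other syscalls such as getpeername() are not matched.
    private static final Pattern CONNECT = Pattern.compile(
        "connect\\(\\d+,\\s+\\{sa_family=AF_INET,\\s+sin_port=htons\\((\\d+)\\)");

    // Returns the number of connect() calls in the strace text that target the given port.
    public static int countConnects(String straceOutput, int port) {
        int count = 0;
        Matcher m = CONNECT.matcher(straceOutput);
        while (m.find()) {
            if (Integer.parseInt(m.group(1)) == port) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: StraceConnectCounter <port>  (strace output on stdin)");
            return;
        }
        int port = Integer.parseInt(args[0]);
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        System.out.println(countConnects(sb.toString(), port)
            + " connect() calls to port " + port);
    }
}
```

It would be fed the strace output on stdin, e.g. strace -f hdfs dfs -cat /ez1/f1 2>&1 | java StraceConnectCounter 16000. It counts connect() syscalls, which is not necessarily the same as the number of established TCP connections.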
> Replace all Http(s)URLConnection
> --------------------------------
>
> Key: HDFS-14594
> URL: https://issues.apache.org/jira/browse/HDFS-14594
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: webhdfs
> Affects Versions: 2.7.3
> Environment: HDP 2.6.5 and HDP 2.6.2
> HotSpot 8u192 and 8u92
> Linux Redhat 3.10.0-862.14.4.el7.x86_64
> Reporter: Sebastien Barnoud
> Priority: Major
>
> When authentication is enabled, there is no keep-alive on HTTP(S)
> connections.
> That's because the JDK Http(s)URLConnection explicitly closes the connection
> after the HTTP 401 that negotiates the authentication.
> This leads to poor performance, especially when encryption is on.
> To see the issue, simply strace and compare the number of connections between
> the hdfs implementation and curl:
> {code:java}
> $ strace -T -tt -f hdfs dfs -ls
> swebhdfs://dtltstap009.fr.world.socgen:50470/user 2>&1 | grep
> "sin_port=htons(50470)"
> [pid 92879] 15:11:47.019865 connect(386, {sa_family=AF_INET,
> sin_port=htons(50470), sin_addr=inet_addr("192.163.201.117")}, 16) = -1
> EINPROGRESS (Operation now in progress) <0.000157>
> [pid 92879] 15:11:47.182110 connect(386, {sa_family=AF_INET,
> sin_port=htons(50470), sin_addr=inet_addr("192.163.201.117")}, 16 <unfinished
> ...>
> [pid 92879] 15:11:47.387073 connect(386, {sa_family=AF_INET,
> sin_port=htons(50470), sin_addr=inet_addr("192.163.201.117")}, 16) = -1
> EINPROGRESS (Operation now in progress) <0.000167>
> [pid 92879] 15:11:47.429716 connect(386, {sa_family=AF_INET,
> sin_port=htons(50470), sin_addr=inet_addr("192.163.201.117")}, 16 <unfinished
> ...>
> [pid 93116] 15:11:47.528073 connect(386, {sa_family=AF_INET,
> sin_port=htons(50470), sin_addr=inet_addr("192.163.201.117")}, 16) = -1
> EINPROGRESS (Operation now in progress) <0.000110>
> [pid 93116] 15:11:47.566947 connect(386, {sa_family=AF_INET,
> sin_port=htons(50470), sin_addr=inet_addr("192.163.201.117")}, 16 <unfinished
> ...>
> => 6 connect{code}
> {code:java}
> $ strace -T -tt -f curl --negotiate -u: -v
> https://dtltstap009.fr.world.socgen:50470/webhdfs/v1/user/?op=GETFILESTATUS
> 2>&1 | grep "sin_port=htons(50470)"
> 15:10:53.671358 connect(3, {sa_family=AF_INET, sin_port=htons(50470),
> sin_addr=inet_addr("192.163.201.117")}, 16) = -1 EINPROGRESS (Operation now
> in progress) <0.000118>
> 15:10:53.683513 getpeername(3, {sa_family=AF_INET, sin_port=htons(50470),
> sin_addr=inet_addr("192.163.201.117")}, [16]) = 0 <0.000009>
> 15:10:53.869482 getpeername(3, {sa_family=AF_INET, sin_port=htons(50470),
> sin_addr=inet_addr("192.163.201.117")}, [16]) = 0 <0.000009>
> 15:10:53.869576 getpeername(3, {sa_family=AF_INET, sin_port=htons(50470),
> sin_addr=inet_addr("192.163.201.117")}, [16]) = 0 <0.000008>
> => only one connect{code}
>
> In addition, even without encryption, too many connections are used:
> {code:java}
> $ strace -T -tt -f hdfs dfs -ls
> webhdfs://dtltstap009.fr.world.socgen:50070/user 2>&1 | grep
> "sin_port=htons(50070)"
> [pid 99569] 15:13:13.838257 connect(386, {sa_family=AF_INET,
> sin_port=htons(50070), sin_addr=inet_addr("192.163.201.117")}, 16) = -1
> EINPROGRESS (Operation now in progress) <0.000119>
> [pid 99569] 15:13:13.904255 connect(386, {sa_family=AF_INET,
> sin_port=htons(50070), sin_addr=inet_addr("192.163.201.117")}, 16 <unfinished
> ...>
> [pid 99635] 15:13:14.201236 connect(386, {sa_family=AF_INET,
> sin_port=htons(50070), sin_addr=inet_addr("192.163.201.117")}, 16 <unfinished
> ...>
> => 3 connect{code}
>
> Looking in the JDK code,
> https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/sun/net/www/protocol/http/HttpURLConnection.java
> {code:java}
> serverAuthentication = getServerAuthentication(srvHdr);
> currentServerCredentials = serverAuthentication;
> if (serverAuthentication != null) {
> disconnectWeb();
> redirects++; // don't let things loop ad nauseum
> setCookieHeader();
> continue;
> }{code}
> disconnectWeb() closes the connection (no keep-alive reuse).
> Finally, we have some unexplained webhdfs commands that are stuck in
> sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375):
> -) for hdfs dfs commands with the swebhdfs scheme
> -) for some TEZ jobs using the same implementation for the shuffle service
> when encryption is on
> All other services (typically RPC) are working fine on the cluster.
> It really seems that Http(s)URLConnection causes some issues that Netty or
> HttpClient don't have.
> Regards,
>
>
>