[ https://issues.apache.org/jira/browse/HADOOP-13144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16814930#comment-16814930 ]
Íñigo Goiri edited comment on HADOOP-13144 at 4/10/19 11:14 PM:
----------------------------------------------------------------
Thanks [~vinayrpet] for the comments.
From my experiments, the Namenode is not the main issue and it can absorb the
load as it has enough readers and handlers.
Security is also disabled.
I added [^HADOOP-13144-performance.patch] with the whole setup.
It also includes the full proposal including the Router changes.
I have to say the differences are not as large as I remembered, and the
results are very sensitive to the order.
After running 20 times and excluding warm-up times, we can see:
# A single Router using the old approach (single connection pool): 483.5 ms
# A single Router using multiple users: 455.2 ms
# A single Router using the new approach (multiple connections): 458.5 ms
# Multiple Routers using the old approach: 483.5 ms
One can see that 2 and 3 are pretty similar, as one would expect from the
analysis.
I'm a little surprised by number 4, though.
These results are not as spectacular as what I observed in HDFS-14316.
My guess is that when everything is fine, there is not much difference.
However, it gets much worse when there are timeouts.
I think in that case, Routers get stuck creating connections, and that's when
one sees the worst impact of having just one socket factory.
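As a rough illustration of how an average that excludes warm-up can be computed, here is a minimal Java sketch. The class name, the method name and the sample latencies are made up for this example; they are not taken from the attached patch and they are not measured results.

```java
import java.util.Arrays;

/** Illustrative sketch only; not part of HADOOP-13144-performance.patch. */
final class WarmupMeanSketch {

  /** Mean of the samples that remain after dropping the first warmupRuns entries. */
  static double meanAfterWarmup(double[] latenciesMs, int warmupRuns) {
    if (warmupRuns < 0 || warmupRuns >= latenciesMs.length) {
      throw new IllegalArgumentException("warmupRuns must be in [0, number of samples)");
    }
    return Arrays.stream(latenciesMs)
        .skip(warmupRuns)   // discard the warm-up iterations
        .average()
        .getAsDouble();     // safe: at least one sample remains
  }

  public static void main(String[] args) {
    // Made-up latencies in ms; the first entry plays the role of a slow warm-up run.
    double[] samples = {900.0, 500.0, 400.0, 300.0};
    System.out.println(meanAfterWarmup(samples, 1) + " ms");
  }
}
```

Dropping the first runs keeps one-off costs such as class loading and initial connection setup out of the average, which matters when the differences between configurations are as small as the ones above.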
> Enhancing IPC client throughput via multiple connections per user
> -----------------------------------------------------------------
>
> Key: HADOOP-13144
> URL: https://issues.apache.org/jira/browse/HADOOP-13144
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc
> Reporter: Jason Kace
> Assignee: Íñigo Goiri
> Priority: Minor
> Attachments: HADOOP-13144-performance.patch, HADOOP-13144.000.patch,
> HADOOP-13144.001.patch, HADOOP-13144.002.patch, HADOOP-13144.003.patch
>
>
> The generic IPC client ({{org.apache.hadoop.ipc.Client}}) utilizes a single
> connection thread for each {{ConnectionId}}. The {{ConnectionId}} is unique
> to the connection's remote address, ticket and protocol. Each ConnectionId
> is 1:1 mapped to a connection thread by the client via a map cache.
> The result is to serialize all IPC read/write activity through a single
> thread for each user/ticket + address. If a single user makes repeated
> calls (1k-100k/sec) to the same destination, the IPC client becomes a
> bottleneck.
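
The caching behavior described above can be sketched in Java as follows. This is a simplified model under stated assumptions, not the actual {{org.apache.hadoop.ipc.Client}} code: the class {{ConnectionCacheSketch}}, its methods and the extra {{index}} key component are hypothetical names introduced only for illustration, and a connection thread is simulated by a string label.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative sketch only; this is not the org.apache.hadoop.ipc.Client code. */
final class ConnectionCacheSketch {
  // Cache key: (address, ticket, protocol, index). The first three mirror what
  // the description says identifies a ConnectionId; "index" is a hypothetical
  // extra component that allows several connections per user.
  private final ConcurrentHashMap<List<Object>, String> cache = new ConcurrentHashMap<>();
  private final AtomicInteger callCounter = new AtomicInteger();
  private final int connectionsPerUser;

  ConnectionCacheSketch(int connectionsPerUser) {
    if (connectionsPerUser < 1) {
      throw new IllegalArgumentException("connectionsPerUser must be >= 1");
    }
    this.connectionsPerUser = connectionsPerUser;
  }

  /** Returns a label for the simulated connection thread that would carry the call. */
  String connectionFor(String address, String ticket, String protocol) {
    // Round-robin over the slots; with one slot every call shares one connection.
    int index = Math.floorMod(callCounter.getAndIncrement(), connectionsPerUser);
    List<Object> key = Arrays.asList(address, ticket, protocol, index);
    return cache.computeIfAbsent(key, k -> ticket + "@" + address + "#" + index);
  }

  /** Number of distinct simulated connections created so far. */
  int distinctConnections() {
    return cache.size();
  }

  public static void main(String[] args) {
    ConnectionCacheSketch single = new ConnectionCacheSketch(1);
    ConnectionCacheSketch multiple = new ConnectionCacheSketch(4);
    for (int i = 0; i < 1000; i++) {
      single.connectionFor("nn1:8020", "router", "ClientProtocol");
      multiple.connectionFor("nn1:8020", "router", "ClientProtocol");
    }
    System.out.println("single:   " + single.distinctConnections());
    System.out.println("multiple: " + multiple.distinctConnections());
  }
}
```

With one slot, all 1000 calls from the same user to the same address map to a single cached entry, which is the serialization point the description refers to. Adding an index to the key spreads the same calls over several entries, which is the general idea behind multiple connections per user.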