[
https://issues.apache.org/jira/browse/MESOS-1706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175471#comment-14175471
]
Niklas Quarfot Nielsen commented on MESOS-1706:
-----------------------------------------------
Quick comment (and I am probably going to close this ticket): the benchmark did
not use link(); when link() is used, things work fine. We can still consider
keeping connections open for a configurable number of seconds so that they can
be reused, but that should not have a large performance impact, given that
frameworks, masters and slaves use link().
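
The idea of keeping connections open for a configurable number of seconds
could be sketched roughly as follows. This is only an illustration, not
libprocess code: `Connection`, `ConnectionPool`, `acquire` and `release` are
hypothetical names, the pool keeps at most one idle connection per peer, and
the actual socket handling (connect(), close()) is left out.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a connected socket; not a libprocess type.
struct Connection {
  int fd;
};

// Hypothetical pool: keeps at most one idle connection per peer and hands it
// out again only if it was released no longer than `idleTimeout` ago.
class ConnectionPool {
public:
  typedef std::chrono::steady_clock Clock;

  explicit ConnectionPool(std::chrono::seconds idleTimeout)
    : idleTimeout_(idleTimeout) {
    assert(idleTimeout.count() >= 0);
  }

  // Copies a still-fresh pooled connection for `peer` into `*out` and
  // returns true. Returns false if there is none (or it has expired), in
  // which case the caller is expected to connect() a new socket.
  bool acquire(const std::string& peer, Clock::time_point now,
               Connection* out) {
    std::map<std::string, Entry>::iterator it = idle_.find(peer);
    if (it == idle_.end()) {
      return false;
    }
    Entry entry = it->second;
    idle_.erase(it);
    if (now - entry.releasedAt > idleTimeout_) {
      // Expired: a real implementation would close(entry.connection.fd).
      return false;
    }
    *out = entry.connection;
    return true;
  }

  // Puts a connection back into the pool once the caller is done with it,
  // replacing any idle connection already stored for that peer.
  void release(const std::string& peer, Connection connection,
               Clock::time_point now) {
    Entry entry;
    entry.connection = connection;
    entry.releasedAt = now;
    idle_[peer] = entry;
  }

  std::size_t idleCount() const { return idle_.size(); }

private:
  struct Entry {
    Connection connection;
    Clock::time_point releasedAt;
  };

  std::chrono::seconds idleTimeout_;
  std::map<std::string, Entry> idle_;
};
```

The clock value is passed in explicitly so that the expiry logic can be
exercised without sleeping; a real pool would also need locking and a
periodic sweep that closes expired sockets.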
> Introduce socket / connection pooling to libprocess
> ---------------------------------------------------
>
> Key: MESOS-1706
> URL: https://issues.apache.org/jira/browse/MESOS-1706
> Project: Mesos
> Issue Type: Improvement
> Components: libprocess
> Reporter: Niklas Quarfot Nielsen
>
> Just wrote a libprocess connection throughput stress test (basically two
> libprocess programs sending messages back and forth). One end is multihomed
> so we can scale up the number of clients.
> The throughput with a single client (10 "concurrent" connections, or rather,
> sending up to 10 messages before awaiting responses) is roughly 8000 - 9000
> requests per second.
> I think I (accidentally) produced more load (around 30,000 requests per
> second) - but I am running into one particular error in both cases: `Failed
> to send, connect: Cannot assign requested address`. According to
> http://khanna111.com/articles/TCPAAIU.html - it seems the only way around it
> is some kind of connection pooling (we already use SO_REUSEADDR).
> It happens during connect() and hints that the machine is running out of
> available ports on the sender end (when getting randomly assigned ports).
> {code}
> I0815 07:03:49.348409 30317 main.cpp:109] 8984.79 requests / second (delta: 1.000356864secs)
> I0815 07:03:50.348898 30320 main.cpp:109] 8715.88 requests / second (delta: 1.000473088secs)
> I0815 07:03:51.349040 30317 main.cpp:109] 8622.64 requests / second (delta: 1.000157184secs)
> I0815 07:03:52.349184 30320 main.cpp:109] 9039.69 requests / second (delta: 1.000144896secs)
> I0815 07:03:53.349478 30319 main.cpp:109] 8768.42 requests / second (delta: 1.000293888secs)
> I0815 07:03:54.349954 30322 main.cpp:109] 8728.9 requests / second (delta: 1.000470016secs)
> I0815 07:03:55.350334 30316 main.cpp:109] 8628.79 requests / second (delta: 1.000371968secs)
> I0815 07:03:56.350957 30320 main.cpp:109] 8726.57 requests / second (delta: 1.000621824secs)
> I0815 07:03:57.351474 30318 main.cpp:109] 8587.46 requests / second (delta: 1.000529152secs)
> I0815 07:03:58.351805 30314 main.cpp:109] 8475.16 requests / second (delta: 1.000335104secs)
> F0815 07:03:59.092653 30323 process.cpp:2197] Failed to send, connect: Cannot assign requested address [99]
> *** Check failure stack trace: ***
> Aborted
> {code}
> One way to deal with it could be to introduce the notion of connection
> pooling.
> Any thoughts?
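
The port-exhaustion explanation in the quoted description can be made
concrete with a back-of-the-envelope calculation. The sketch below assumes
that every request opens a new connection from one source address to one
destination, and that each closed connection keeps its ephemeral port
reserved for a fixed TIME_WAIT period. The function name and the example
values (port range 32768-60999, 60 seconds) are assumptions for illustration;
the real range is system-specific and on Linux can be read from
/proc/sys/net/ipv4/ip_local_port_range.

```cpp
#include <cassert>

// Rough upper bound on new outgoing connections per second from one source
// address to a single destination (address, port), assuming every closed
// connection holds its ephemeral port for `timeWaitSeconds`.
double sustainableConnectRate(int firstPort, int lastPort,
                              double timeWaitSeconds) {
  assert(lastPort >= firstPort);
  assert(timeWaitSeconds > 0);
  const int ports = lastPort - firstPort + 1;  // size of the ephemeral range
  return ports / timeWaitSeconds;
}
```

With the assumed values, sustainableConnectRate(32768, 60999, 60.0) comes out
at roughly 470 connections per second, well below the 8000 - 9000 requests
per second reported, which is consistent with connect() eventually failing
when connections are not reused.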