Thanks for the proposal, Joseph! I think I'm also leaning toward the circular buffer approach. The one real concern there seems to be the potential for "DDoS"-type scenarios when users hit the subscriber limit using clients which have retry logic. Providing a metric for the number of currently connected subscribers will hopefully help operators avoid this.
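To make the retry concern concrete, here is a rough Python sketch of the idea. It is not the code in the review: `SubscriberBuffer`, `connect`, `connected_count`, and `simulate_retries` are hypothetical names made up for illustration, and it assumes every evicted client retries right away.

```python
from collections import OrderedDict


class SubscriberBuffer:
    """Bounded set of subscribers that disconnects the oldest when full.

    Illustrative only; names and behavior are assumptions, not Mesos code.
    """

    def __init__(self, max_subscribers):
        if max_subscribers < 1:
            raise ValueError("max_subscribers must be at least 1")
        self.max_subscribers = max_subscribers
        # Insertion order doubles as connection age (oldest first).
        self._subscribers = OrderedDict()

    def connect(self, client_id):
        """Add a subscriber and return the ids evicted to make room."""
        # A reconnecting client is treated as a brand-new (youngest) subscriber.
        self._subscribers.pop(client_id, None)
        evicted = []
        while len(self._subscribers) >= self.max_subscribers:
            oldest, _ = self._subscribers.popitem(last=False)
            evicted.append(oldest)
        self._subscribers[client_id] = True
        return evicted

    def connected_count(self):
        """Value a 'currently connected subscribers' metric would report."""
        return len(self._subscribers)


def simulate_retries(num_clients, max_subscribers, rounds):
    """Count total connects when every evicted client retries immediately."""
    buffer = SubscriberBuffer(max_subscribers)
    pending = [f"client-{i}" for i in range(num_clients)]
    connects = 0
    for _ in range(rounds):
        retrying = []
        for client in pending:
            retrying.extend(buffer.connect(client))
            connects += 1
        pending = retrying
        if not pending:
            break  # everyone fits, nothing left to retry
    return connects, buffer.connected_count()


print(simulate_retries(3, 3, 3))
print(simulate_retries(3, 2, 3))
```

Under these assumptions, three clients with a limit of three connect once each and stay connected. The same three clients with a limit of two keep evicting each other: five connects over three rounds, with the connected count pinned at the limit. A count sitting at the limit is the signal the metric would give operators.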
The default value for a new flag limiting subscriber count should be very high (MAX_INT??) to maintain current behavior.

What do other folks think about this approach? Joseph's draft review is here:
https://reviews.apache.org/r/69307/

Greg

On Wed, Nov 14, 2018 at 6:35 PM Joseph Wu <jos...@mesosphere.io> wrote:

> Heartbeats are currently the least-liked solution, for precisely the
> reason BenM stated. Clients of the API, such as the maintainers of the
> DC/OS UI, would also like to avoid making more connections than necessary
> and/or keeping additional state between connections.
>
> Currently, I am leaning towards keeping subscribers in a circular buffer.
> This solution is minimal in the code footprint and requires no client-side
> changes besides heavily incentivizing retry logic (which we already expect
> in most cases).
> One potential downside is having more subscribers than the (master flag)
> configured maximum. In this case, each client would kick out the first
> few; which would then retry and kick out the next few, etc. Each retry is
> equivalent to a GET /master/state, and the extra calls would basically
> erase the performance gains we have from streaming the events.
>
> Nevertheless, I think a reasonably high default would have minimal impact
> on both master performance and client connectivity. The code for this
> proposal can be found here:
> https://reviews.apache.org/r/69307/ (Just one review)
>
> On Sun, Nov 11, 2018 at 9:22 AM Benjamin Mahler <bmah...@apache.org>
> wrote:
>
>> > - We can add heartbeats to the SUBSCRIBE call. This would need to be
>> >   part of a separate operator Call, because one platform (browsers) that
>> >   might subscribe to the master does not support two-way streaming.
>>
>> This doesn't make sense to me, the heartbeats should still be part of the
>> same connection (request and response are infinite and heartbeating) by
>> default. Splitting into a separate call is messy and shouldn't be what we
>> force everyone to do, it should only be done in cases that it's impossible
>> to use a single connection (e.g. browsers).
>>
>> On Sat, Nov 10, 2018 at 12:03 AM Joseph Wu <jos...@mesosphere.io> wrote:
>>
>>> Hi all,
>>>
>>> During some internal scale testing, we noticed that, when Mesos streaming
>>> endpoints are accessed via certain proxies (or load balancers), the
>>> proxies might not close connections after they are complete. For the
>>> Mesos master, which only has the /api/v1 SUBSCRIBE streaming endpoint,
>>> this can generate unnecessary authorization requests and affects
>>> performance.
>>>
>>> We are considering a few potential solutions:
>>>
>>>    - We can add heartbeats to the SUBSCRIBE call. This would need to be
>>>      part of a separate operator Call, because one platform (browsers)
>>>      that might subscribe to the master does not support two-way
>>>      streaming.
>>>    - We can add (optional) arguments to the SUBSCRIBE call, which tells
>>>      the master to disconnect it after a while. And the client would
>>>      have to remake the connection every so often.
>>>    - We can change the master to hold subscribers in a circular buffer,
>>>      and disconnect the oldest ones if there are too many connections.
>>>
>>> We're tracking progress on this issue here:
>>> https://issues.apache.org/jira/browse/MESOS-9258
>>> Some prototypes of the code changes involved are also linked in the JIRA.
>>>
>>> Please chime in if you have any suggestions or if any of these options
>>> would be undesirable/bad,
>>> ~Joseph