Hello Dev-Media! I'm Ben Bangert, engineer in Cloud Services, leading technical development on Push.

This is a general update on the status of SimplePush as it pertains to Loop's use of it for call push notifications. Currently we are proceeding on the assumption that the only viable way to alert the client of a call is via SimplePush, which Loop currently uses through our FxOS SimplePush cluster.

I've talked with Adam Roach about some ways to alter the Loop client's usage slightly so that we can stand up a separate set of 'simplepush' clusters specifically for use by Loop. These separate clusters will run a lighter-weight version of SimplePush (in development now) that skips the longer-term state maintenance, which is unnecessary for Loop's use.

SimplePush, the version currently running, has been benchmarked to about 1 million clients using a cluster of 5 machines. This cluster required memcached because it stored some state that Loop's version (loopPush) doesn't require.

Adam indicated that it seems reasonable that 20% of Firefox users might click the Loop icon, which would be around 100 million connections. Confirmation or refinement of that number would be helpful. For this load we would deploy multiple loopPush clusters and provide a URL that the Loop client would query before initially connecting to determine which clusters have capacity to handle more connections.

Remaining work to be done:

- simplepush codebase
  - refactoring already under way for not using memcached
  - additional clean-ups, performance optimizations
- cluster setup automation
- loads (or some other testing tool)
  - needs dynamic test client handling (we will need over 1,000 instances to be spun up for larger scale test runs)
  - ideally some automation integration (so that we can wire it into jenkins for automatic runs)
- integrated complete testing (Tokbox, Simplepush, Loop)

Tokbox has contacted us regarding how we plan on handling load testing of the push system and ensuring their TURN/STUN servers can meet requirements at scale.
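
As a rough sanity check on the numbers above, here is a minimal Python sketch of the cluster arithmetic and of how a capacity lookup might choose a cluster. It assumes a loopPush cluster handles about the same load as the benchmarked SimplePush cluster, which has not been measured yet. The function names, the dict layout, and the example URLs are all hypothetical and do not describe any existing endpoint.

```python
# Figures quoted in this message.
CLIENTS_PER_CLUSTER = 1_000_000      # benchmarked on the current SimplePush
MACHINES_PER_CLUSTER = 5
ESTIMATED_CONNECTIONS = 100_000_000  # 20% of Firefox users, per Adam's estimate


def clusters_needed(connections, clients_per_cluster=CLIENTS_PER_CLUSTER):
    """Number of clusters required, rounding up (integer ceiling division)."""
    return -(-connections // clients_per_cluster)


def machines_needed(connections):
    """Total machines, assuming every cluster keeps the 5-machine layout."""
    return clusters_needed(connections) * MACHINES_PER_CLUSTER


def pick_cluster(clusters):
    """Return the URL of the cluster with the most free slots, or None.

    `clusters` is a hypothetical list of dicts with the keys
    "url", "connections" and "capacity".
    """
    available = [c for c in clusters if c["connections"] < c["capacity"]]
    if not available:
        return None
    best = max(available, key=lambda c: c["capacity"] - c["connections"])
    return best["url"]


if __name__ == "__main__":
    print(clusters_needed(ESTIMATED_CONNECTIONS), "clusters")
    print(machines_needed(ESTIMATED_CONNECTIONS), "machines")
    print(pick_cluster([
        {"url": "wss://push-a.example", "connections": 1_000_000, "capacity": 1_000_000},
        {"url": "wss://push-b.example", "connections": 400_000, "capacity": 1_000_000},
    ]))
```

If loopPush turns out to hold more connections per machine than the current code, the cluster count drops proportionally, so these figures are best read as an upper-end starting point for planning.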

If we're looking at 100 million people possibly using this, how many are behind firewalls that will require complete TURN server proxying? If we have some idea of what capacity Tokbox can/should handle, they'll want to sync up with us on that.

We will also need to do some failure-case testing of the Loop client to determine its behavior under various failures in the SimplePush service.

Regarding timelines, we believe it is possible to have a basic service up around the time Firefox 33 hits the public, but since we haven't yet done any load testing, it's hard to determine what kind of capacity it can handle. I'm also not sure what timelines Loop and Tokbox are operating with when it comes to having production deployments ready that can handle 100 million users. Does anyone have roadmaps and timelines for these components?

For those interested, we have the Push Meeting on Weds at 11am PST in the Vidyo Services channel.

Cheers,
Ben

