Hello Dev-Media!

I'm Ben Bangert, engineer in Cloud Services, leading technical development on 
Push.

This is a general update on the status of simplepush as it pertains to usage by 
loop for call push notifications.

We are proceeding on the assumption that the only viable way to alert the 
client of a call is via simplepush, which loop currently uses through our 
FxOS simplepush cluster. I've talked with Adam Roach about some ways to 
alter the loop client's use slightly so that we can stand up a separate set of 
'simplepush' clusters specifically for use by loop. These separate clusters 
will use a lighter-weight version of simplepush (in development now) that 
skips the longer-term state maintenance, which loop doesn't need.

The version of simplepush currently running has been benchmarked at about 1 
million clients on a cluster of 5 machines. That cluster required memcached to 
store some state that loop's version (loopPush) doesn't need.

Adam indicated that it seems reasonable that 20% of Firefox users might click 
the Loop icon, which would be around 100 million connections. Confirmation or 
refinement of that number would be helpful. For this load we would deploy 
multiple loopPush clusters and provide a URL that the loop client would query 
before initially connecting to determine which clusters have capacity to handle 
more connections.
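
As a rough sizing sketch, not a measured result: if loopPush clusters handle 
about the same load as the simplepush benchmark above (an assumption, since 
loopPush has not been benchmarked yet), the numbers work out as below. The 
Cluster record, its field names, and the example URLs are hypothetical and only 
illustrate how the capacity query might pick clusters for a client.

```python
from dataclasses import dataclass

# Figures quoted above; applying them to loopPush is an assumption.
CLIENTS_PER_CLUSTER = 1_000_000    # simplepush benchmark, 5-machine cluster
MACHINES_PER_CLUSTER = 5
EXPECTED_CONNECTIONS = 100_000_000


def clusters_needed(connections, per_cluster=CLIENTS_PER_CLUSTER):
    """Number of clusters required, rounding up to a whole cluster."""
    return -(-connections // per_cluster)


@dataclass
class Cluster:
    # Hypothetical record; not part of any existing simplepush API.
    url: str
    connections: int
    capacity: int = CLIENTS_PER_CLUSTER

    @property
    def headroom(self):
        """Connections this cluster can still accept."""
        return self.capacity - self.connections


def clusters_with_capacity(clusters):
    """URLs of clusters that can take more connections, most headroom first."""
    available = [c for c in clusters if c.headroom > 0]
    available.sort(key=lambda c: c.headroom, reverse=True)
    return [c.url for c in available]


if __name__ == "__main__":
    n = clusters_needed(EXPECTED_CONNECTIONS)
    print(n, "clusters,", n * MACHINES_PER_CLUSTER, "machines")
    print(clusters_with_capacity([
        Cluster("wss://push1.example.invalid", 1_000_000),
        Cluster("wss://push2.example.invalid", 400_000),
        Cluster("wss://push3.example.invalid", 900_000),
    ]))
```

Under those assumptions, 100 million connections means on the order of 100 
clusters, or 500 machines, before any gains from dropping memcached.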

Remaining work to be done:
- simplepush codebase
  - refactoring already under way for not using memcached
  - additional clean-ups, performance optimizations
  - cluster setup automation

- loads (or some other testing tool)
  - needs dynamic test client handling (we will need over 1,000 instances to be 
    spun up for larger scale test runs)
  - ideally some automation integration (so that we can wire it into jenkins 
    for automatic runs)

- integrated complete testing (Tokbox, Simplepush, Loop)
  Tokbox has contacted us regarding how we plan on handling load testing of the 
  push system and ensuring their TURN/STUN servers can meet requirements at 
  scale. If we're looking at 100 million people possibly using this, how many 
  are behind firewalls that will require complete TURN server proxying? If we 
  have some ideas about what capacity Tokbox can/should handle, they'll want to 
  sync up with us on that.

We will also need to do some failure-case testing of the loop client to 
determine behavior under various failures in the SimplePush service.

Regarding timelines, we believe it is possible to have a basic service up 
around the time Firefox 33 reaches the public, but since we haven't done any 
load testing yet, it's hard to say what capacity it will handle. I'm also not 
sure what timeline Loop or Tokbox is operating with when it comes to having 
production deployments ready that can handle 100 million users.

Does anyone have roadmaps and timelines for these components?

For those interested, we have the Push Meeting on Weds at 11am PST in the Vidyo 
Services channel.

Cheers,
Ben
_______________________________________________
dev-media mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-media
