Hi folks,

The Hello clients (desktop and mobile) had trouble making calls over the weekend and for part of today. This was due to a problem in the LoopPush server update. I asked Ben Bengert from the Push Server team to write a summary of what happened:

   On Friday around 11am PST, ops deployed pushgo 1.4rc5 to production
   SimplePush and production LoopPush. Errors started occurring almost
   immediately. Several hours later, a hotfix (1.4rc6) was deployed,
   which appeared to resolve the notification delivery errors. On
   Saturday it was discovered that the errors had returned, and on
   Sunday a new fix was written based on analysis of the code
   involved. On Monday morning, 1.4rc7 was deployed to production
   SimplePush, and it has thus far remedied the issue.

   The pushgo 1.4 series replaces the prior system in how it handles
   inter-node notification routing (amongst many other changes). The
   new system uses a peer discovery system backed by etcd such that
   each server in the cluster registers itself and then queries etcd to
   discover its peers. Due to a bug in how network failures were
   handled, if an attempt to check for peers in etcd failed, pushgo
   would wipe its known list of peers entirely. A similar bug in error
   handling resulted in servers that failed to re-register their
   presence in etcd being removed from etcd. Fixes for these bugs are
   in 1.4rc7 and have held up in the hours since deployment, with no
   losses in peer visibility.
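
To make the first failure mode concrete, here is a minimal sketch of the safer behavior: when a peer lookup fails, keep the last known-good list instead of wiping it. This is not pushgo's code. It is written in Python for brevity, and the names PeerTable, fetch_peers, and consecutive_failures are hypothetical; fetch_peers simply stands in for whatever call queries etcd.

```python
from typing import Callable, List


class PeerTable:
    """Holds the last known-good peer list for one node (hypothetical helper)."""

    def __init__(self, fetch_peers: Callable[[], List[str]]) -> None:
        # fetch_peers is an assumed stand-in for the etcd query.
        # It is expected to raise OSError (or a subclass) on network failure.
        self._fetch_peers = fetch_peers
        self.peers: List[str] = []
        self.consecutive_failures = 0

    def refresh(self) -> bool:
        """Try to update the peer list; return True if the lookup succeeded."""
        try:
            latest = self._fetch_peers()
        except OSError:
            # Lookup failed: keep the stale list rather than clearing it,
            # and count the failure so callers can alert on repeated errors.
            self.consecutive_failures += 1
            return False
        # Lookup succeeded: replace the list and reset the failure counter.
        self.peers = sorted(set(latest))
        self.consecutive_failures = 0
        return True


if __name__ == "__main__":
    # Simulate one successful lookup followed by a network failure.
    responses = iter([["node-b", "node-a"], ConnectionError("etcd unreachable")])

    def fake_lookup() -> List[str]:
        item = next(responses)
        if isinstance(item, Exception):
            raise item
        return item

    table = PeerTable(fake_lookup)
    table.refresh()
    table.refresh()
    print(table.peers, table.consecutive_failures)
```

A stale peer list can still route to a node that has gone away, so a real system would presumably pair this with a limit on how long the stale list is trusted; that policy is left out of the sketch.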

   There were several process problems leading up to this deployment,
   as it was not intended to be deployed to production SimplePush. The
   Bugzilla ticket in question (#1097324) indicated that deployment
   should occur for both production SimplePush and LoopPush, when it
   should have occurred only for LoopPush. We are currently conducting
   a more thorough postmortem of the issue to determine appropriate
   steps to prevent unintended deployments like this from occurring
   again.

--
Maire Reavy <[email protected]>
Mozilla

_______________________________________________
dev-media mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-media
